rabpro
rabpro (core.py)
Class for running rabpro commands on your data.
- class rabpro.core.profiler(coords, da=None, name='unnamed', path_results=None, verbose=True, update_gee_metadata=True)
The profiler class organizes data and methods for using rabpro. This is a parent class to the centerline and point classes which inherit profiler methods and attributes.
- Attributes:
- coordstuple OR list OR str OR GeoDataFrame
Coordinates of point(s) to delineate. A single point may be provided as a (lat, lon) tuple. Point(s) may also be provided via a shapefile or .csv file. If provided as .csv file, the columns must be labeled ‘latitude’ and ‘longitude’. If a shapefile path is provided, the shapefile may also contain columns for widths and/or along-valley distances. These columns should have “width” and “distance” somewhere in their column names, respectively, in order for rabpro to exploit them.
- danumber
Represents the drainage area in square kilometers of the downstream-most point in the provided coords.
- namestr
Name of location/river, used for filename saving purposes only.
- verbosebool
If True, will print updates as processing progresses.
- update_gee_metadatabool
If True, will attempt to download the latest GEE Data Catalog metadata.
Methods
delineate_basin:
Computes the watersheds for each lat/lon pair
elev_profile:
Compute the elevation profile
basin_stats:
Computes watershed statistics
export:
Exports computed data
- basin_stats(datasets, gee_feature_path=None, reducer_funcs=None, folder=None, test=False)
Computes watershed statistics.
- Parameters:
- datasetslist of Dataset
Datasets to compute stats over. See the subbasin_stats.Dataset class
- reducer_funcslist of functions
List of functions to apply to each feature over each dataset. Each function should take in an ee.Feature() object. For example, this is how the function and header are applied on a feature: feature.set(f.__name__, function(feature))
- folderstr
Google Drive folder to store results in
- delineate_basin(search_radius=None, map_only=False, force_merit=False, force_hydrobasins=False)
Computes the watersheds for each lat/lon pair and adds their drainage areas to the self.gdf GeoDataFrame.
There are two methods available for delineating basins: HydroBASINS and MERIT. HydroBASINS is appropriate for large basins (1000 km^2 and larger), and MERIT can provide more detailed basin delineations for smaller basins. The method used depends on the size of the basin, which is interpreted via the provided drainage area. If no drainage area was provided, HydroBASINS will be used. Otherwise, if the provided drainage area is less than 1000 km^2, MERIT will be used. MERIT may also be forced for larger basins using the ‘force_merit’ argument when instantiating the profiler.
- Parameters:
- search_radiusnumeric, optional
in meters, by default None
- map_onlybool, optional
If we only want to map the point and not delineate the basin, by default False
- force_meritbool, optional
Forces the use of MERIT to delineate basins.
- force_hydrobasinsbool, optional
Forces the use of HydroBASINS to delineate basins.
- Warns:
- UserWarning
If no drainage area provided
- RuntimeWarning
If MERIT or HydroBasins data is not available, or other issues
- elev_profile(dist_to_walk_km=None)
Compute the elevation profile. The profile is computed such that the provided coordinate is the centerpoint (check if this is true).
- Parameters:
- dist_to_walk_kmnumeric
The distance to trace the elevation profile from the provided point. This distance applies to upstream and downstream–i.e. the total profile distance is twice this value. If not specified, will be automatically computed as 10 channel widths from provided DA value if not specified OR 5 km, whichever is larger.
- export(what='all')
Exports data computed by rapbro.
- Parameters:
- whatlist, optional
Which data should be exported? Choose from ‘all’ - all computed data ‘flowline’ - flowline json is exported with elevation and distance attributes ‘watershed’ - watershed polygon The default is ‘all’.
Data utility functions (data_utils.py)
- rabpro.data_utils.does_hydrobasins_exist(datapaths)
Checks if level 1 and level 12 HydroBasins data are available.
- rabpro.data_utils.does_merit_exist(datapaths)
Checks if any MERIT tiles are in the MERIT data directories. Also checks if the vrts have been built. Returns the number of MERIT layers that have data (maximum of 4).
- rabpro.data_utils.download_hydrobasins(datapath=None, proxy=None)
Downloads HydroBASINS to the proper location, then unzips and deletes zip file.
- Parameters:
- datapathstr or Path, optional
Directory to download and unzip HydroBasins data into; does not include filename. If None, will use the rabpro default.
- proxystr, optional
Pass a proxy to requests.get, by default None
Examples
from rabpro import data_utils data_utils.hydrobasins()
- rabpro.data_utils.download_merit_hydro(merit_tile, username, password, proxy=None)
Downloads the four required rabpro layers for a given MERIT-Hydro 30 degree x 30 degree tile. Each tile contains 5 degree x 5 degree geotiffs. Data are downloaded and untarred into the location and filestructure rabpro expects. Virtual rasters are rebuilt after downloading this data.
- Parameters:
- merit_tilestr
The MERIT-Hydro tile to fetch. e.g. “n45w100”
- usernamestr
The MERIT-Hydro username (must request from MERIT-Hydro developers).
- passwordstr
The MERIT-Hydro password (must request from MERIT-Hydro developers).
- proxystr, optional
A proxy to pass to requests Session.
Elevation profile computation (elev_profile.py)
- rabpro.elev_profile.main(gdf, dist_to_walk_km, verbose=False, nrows=50, ncols=50)
Compute the elevation profile. The profile is computed such that the provided coordinate is the centerpoint (check if this is true).
- Parameters:
- gdfGeoDataFrame
Starting point geometry. Should have a column called ‘DA’ that stores drainage areas. Should be in EPSG 4326 for use of the Haversine formula
- dist_to_walk_kmnumeric
Distance in kilometers to walk up- and downstream of the provided, mapped point to resolve the flowline
- verbosebool
Defaults to False.
- nrowsint
Number of rows in the neighborhood of the point to search. Defaults to 50.
- ncolsint
Number of columns in the neighborhood of the point to search. Defaults to 50.
- Returns:
- gdfGeoDataFrame
The original point layer geometry.
- flowlineGeoDataFrame
The elevation profile geometry.
- Warns:
- DeprecationWarning
If a list of coordinates is passed
Examples
import rabpro coords = (56.22659, -130.87974) rpo = rabpro.profiler(coords, name='basic_test') gdf, flowline = rabpro.elev_profile.main(rpo.gdf, dist_to_walk_km=5)
MERIT utilities (merit_utils.py)
Utility functions for dealing with MERIT datasets
- rabpro.merit_utils.idcs_to_geopolygons(idcs, gdobj, buf_amt=0.001)
Given a list of of pixel indices within a raster specified by gdobj, creates georeferenced polygons of the blobs formed by the union of the pixels.
“Wrapping” is also checked - this is to handle cases where the dateline meridian is crossed and return is therefore a set of polygons rather than a continuous one.
- Parameters:
- idcslist of integers
Pixel indices within the raster specified by gdobj that should be included in the polygon.
- gdobjosgeo.gdal.Dataset
Object created by gdal.Open() on a raster or virtual raster.
- buf_amtnumeric, optional
Amount by which to buffer pixels before unioning-helps close tiny gaps. By default 0.001.
- Returns:
- pgonslist of shapely.geometry.Polygon
List of georeferenced polygons; one per blob of indices.
- crossingbool
If True, the polygons represent those that cross the dateline meridian (i.e. 180 degrees -> -180 degrees) and have been split.
- rabpro.merit_utils.map_cl_pt_to_flowline(lonlat, da_obj, nrows, ncols, da=None, basin_pgon=None, fdr_obj=None, fdr_map=None)
Maps a point of known drainage area to a flowline of a flow accumulation grid. Returns the row, col of the mapped-to pixel. User may provide a basin polygon (in EPSG:4326) if already known. This polygon will be used to ensure the mapped-to-flowline is the correct one. If the basin polygon is provided, a flow directions object and its mapping must also be provided as well as the drainage area.
- Parameters:
- lonlatlist or tuple
Two-element list/tuple containing (longitude, latitude) coordinates of the point to map to a flowline.
- da_objosgeo.gdal.Dataset
Flow accumulation object. Created by gdal.Open() on raster containing flow accumulations.
- nrowsint
Number of rows in the neighborhood of the point to search.
- ncolsint
Number of columns in the neighborhood of the point to search.
- dafloat, optional
Drainage area of the point/gage if known. Units should correspond to those in da_obj, typically km^2. By default None.
- basin_pgonshapely.geometry.polygon.Polygon, optional
Polygon of the watershed of the point, if known. By default None.
- fdr_objosgeo.gdal.Dataset, optional
Flow direction object. Created by gdal.Open() on raster containing flow directions. Must be specified in order to use the basin_pgon. By default None.
- fdr_maplist, optional
8-entry list corresponding to the numeric value for flow directions. The list should take the form [NW, N, NE, W, E, SW, S, SE]. By default None.
- Returns:
- (c_mapped, r_mapped)tuple or None
x and y coordinates of the mapped points. If no mapping is possible, None is returned. The x and y coordinates are with respect to the fac_obj.
- solve_methodint
Indicates the reason why mapping succeeded/failed: 1 - (success) DA provided; a nearby flowline pixel was found within 15%
of the provided DA
- 2 - (success) DA provided; match was found on a nearby flowline that is
within our DA certainty bounds
3 - (success) basin polygon provided; a mappable flowline was found 4 - (success) DA not provided; mapped to the nearest flowline (>1km^2) 5 - (fail) DA not provided; no nearby flowlines exist 6 - (fail) DA provided; but no nearby DAs were close enough to map to 7 - (fail) basin polygon provided; but no nearby DAs were within the allowable
range
- 8 - (fail) basin polygon provided; no flowlines were 25% within the provided
basin
- Warns:
- UserWarning
No drainage area, flow, directions, or flow directions map provided
- rabpro.merit_utils.neighborhood_vals_from_raster(cr, shape, vrt_obj, nodataval=<Mock name='mock.nan' id='139955189116144'>, wrap=None)
Queries a (virtual) raster object to return an array of neighbors surrounding a given point specified by cr (column, row). A shape can be provided to return as large of a neighborhood as desired; both dimensions must be odd. This function is almost always unnecessary and could be replaced with a single call to gdal’s ReadAsArray(), except that throws errors when requesting a neighborhood that is beyond the boundaries of the raster. Also note that requests for negative offsets do not throw errors, which is dangerous. This function checks for those cases and handles them.
An option is provided to ‘wrap’ in cases where neighborhoods are requested beyond the bounds of the raster. In these cases, the raster is effectively “grown” by appending copies of itself to ensure no nodata are returned. (This isn’t actually how the code works, just the simplest explanation.)
- Parameters:
- crtuple
(column, row) indices within the virtual raster specifying the point around which a neighborhood is requested.
- shapetuple
Two-element tuple (nrows, ncols) specifying the shape of the neighborhood around cr to query.
- vrt_objgdal.Dataset
Dataset object pointing to the raster from which to read; created by gdal.Open(path_to_raster).
- nodatavalobject
Value to assign neighbors that are beyond the bounds of the raster. By default np.nan.
- wrapstr or None
String of ‘h’, ‘v’, or ‘hv’ denoting if horizontal and/or vertical wrapping is desired. If None, no wrapping is performed. By default None.
- Returns:
- Ivalsnp.array
Array of same dimensions as shape containing the neighborhood values.
- rabpro.merit_utils.trace_flowpath(fdr_obj, da_obj, cr_stpt, dist_to_walk_km=1, fmap=[32, 64, 128, 16, 1, 8, 4, 2])
Returns a flowpath of pixels up- and downstream of cr_stpt a distance of dist_to_walk_km in each direction (unless the end of the flowpath is reached). Flowpath is returned US->DS ordered.
- Parameters:
- fdr_objosgeo.gdal.Dataset
flow direction object opened with gdal.Open(). Assumes flow direction symbology matches MERIT-Hydro: 32 64 128 16 1 8 4 2
- da_objosgeo.gdal.Dataset
drainage area object opened with gdal.Open()
- cr_stptnumpy.ndarray
column, row of point to start walk
- cr_enptnumpy.ndarray, optional
column, row of point to end walk. By default None
- n_stepsint, optional
number of steps (pixels) to walk before halting. By default None
- fmaplist, optional
[NW, N, NE, W, E, SW, S, SE], by default [32, 64, 128, 16, 1, 8, 4, 2]
- Returns:
- tuple
(rows, cols) of flowline.
Basin statistics (basin_stats.py)
Computes basin statistics using Google Earth Engine.
- class rabpro.basin_stats.Dataset(data_id, band='None', resolution=None, start=None, end=None, stats=None, time_stats=None, mask=False, mosaic=False, prepend_label='', gee_type=None)
Represents one band of a GEE dataset with parameters specifying how to compute statistics.
- Attributes:
- data_idstr
Google Earth Engine dataset asset id
- bandstr, optional
Google Earth Engine dataset band name
- resolutionint, optional
Desired resolution in meters of the calculation over the dataset. Defaults to native resolution of the dataset.
- startstr, optional
Desired start date of data in ISO format: YYYY-MM-DD. Defaults to None which GEE interprets as the dataset start.
- endstr, optional
Desired end date of data in ISO format: YYYY-MM-DD. Defaults to None which GEE interprets as the dataset end.
- statslist, optional
List of desired stats to compute: min, max, range, mean, count, std, sum, and percentiles in the following format: pct1, pct90, pct100, etc. Computes [“count”, “mean”] by default.
- time_statslist, optional
List of desired stats to compute through time.
- maskbool, optional
Whether or not to mask out water in the dataset using the Global Surface Water occurrence band. By default False.
- mosaicbool, optional
If True, the imageCollection in data_id will be mosaiced first. Useful when the imageCollection is a set of tiled images. Do not use for time-varying imageCollections.
- prepend_labelstr, optional
Text to prepend to the exported statistics file.
- gee_typestr, optional
Either ‘image’ or ‘imagecollection’.
- __str__()
Return str(self).
- rabpro.basin_stats.compute(dataset_list, gee_feature_path=None, gee_featureCollection=None, basins_gdf=None, reducer_funcs=None, folder=None, filename=None, verbose=False, test=False, validate_dataset_list=True)
Compute subbasin statistics for each dataset and band specified. One of gee_feature_path, gee_featureCollection, or basins_gdf must be specified.
- Parameters:
- dataset_listlist of Datasets
List of rabpro Dataset objects to compute statistics over.
- basins_gdfGeoDataFrame
Table of subbasin geometries.
- gee_feature_pathstring
Path to a GEE feature collection
- gee_featureCollectionee.FeatureCollection object
FeatureCollection object.
- reducer_funcslist of functions, optional
List of functions to apply to each feature over each dataset. Each function should take in an ee.Feature() object. For example, this is how the function and header are applied on a feature: feature.set(f.__name__, function(feature))
- verbosebool, optional
By default False.
- folderstr, optional
Google Drive folder to store results in, by default top-level root.
- filenamestr, optional
Name of the GEE export file.
- testbool, optional
Return results to the active python session in addition to GDrive
- validate_dataset_list: bool, optional
Validate the dataset_list against a scraped version of the GEE catalog?
Examples
import rabpro import numpy as np import geopandas as gpd from shapely.geometry import box from rabpro.basin_stats import Dataset total_bounds = np.array([-85.91331249, 39.42609864, -85.88453019, 39.46429816]) gdf = gpd.GeoDataFrame({"idx": [1], "geometry": [box(*total_bounds)]}, crs="EPSG:4326") # defaults data, task = rabpro.basin_stats.compute( [ Dataset( "JRC/GSW1_4/MonthlyRecurrence", "monthly_recurrence", ) ], basins_gdf = gdf, test = True, ) # with time_stats specified data, task = rabpro.basin_stats.compute( [ Dataset( "JRC/GSW1_4/MonthlyRecurrence", "monthly_recurrence", time_stats = ["median"] ) ], basins_gdf = gdf, test = True, )
- rabpro.basin_stats.dataset_to_filename(prepend, data_id, band=None)
Examples
import rabpro rabpro.basin_stats.dataset_to_filename("ECMWF/ERA5_LAND/MONTHLY", "temperature_2m")
- rabpro.basin_stats.fetch_gee(url_list, prepend_list, col_drop_list=[], col_protect_list=['id_basin', 'id_outlet', 'idx', 'id', 'vote_id'], col_drop_defaults=['DA', 'count', '.geo', 'da_km2'])
Download and format data from GEE urls
- Parameters:
- url_listlist
list of urls returned from compute
- prepend_listlist
tags to pre-append to corresponding columns, of length url_list
- col_drop_listlist, optional
custom columns to drop, by default []
- col_protect_listlist, optional
columns to avoid tagging, by default [“id_basin”, “id_outlet”, “idx”, “id”, “vote_id”]
- col_drop_defaultslist, optional
built-in columns to drop, by default [“DA”, “count”, “.geo”, “system:index”, “da_km2”]
- Returns:
- DataFrame
Examples
import numpy as np import geopandas as gpd from shapely.geometry import box import rabpro from rabpro.basin_stats import Dataset total_bounds = np.array([-85.91331249, 39.42609864, -85.88453019, 39.46429816]) basin = gpd.GeoDataFrame({"idx": [0], "geometry": [box(*total_bounds)]}, crs="EPSG:4326") dataset_list = [ Dataset("ECMWF/ERA5_LAND/MONTHLY", "temperature_2m", stats=["mean"], time_stats = ["median"]), Dataset("NASA/GPM_L3/IMERG_MONTHLY_V06", "precipitation", stats=["mean"], time_stats = ["median"]), ] urls, tasks = rabpro.basin_stats.compute( dataset_list, basins_gdf=basin, folder="rabpro" ) tag_list = ["temperature", "precip"] data = rabpro.basin_stats.fetch_gee(urls, tag_list, col_drop_list = ["system:index"])
- rabpro.basin_stats.format_gee(df_list, prepend_list, col_drop_list=[], col_protect_list=['id_basin', 'id_outlet', 'idx', 'id', 'vote_id'], col_drop_defaults=['DA', 'count', '.geo', 'da_km2'])
- Parameters:
- df_listlist
list of DataFrames
- prepend_listlist
tags to pre-append to corresponding columns, of length url_list
- col_drop_listlist, optional
custom columns to drop, by default []
- col_protect_listlist, optional
columns to avoid tagging, by default [“id_basin”, “id_outlet”, “idx”, “id”, “vote_id”]
- col_drop_defaultslist, optional
built-in columns to drop, by default [“DA”, “count”, “.geo”, “system:index”, “da_km2”]
- Returns:
- DataFrame
- rabpro.basin_stats.image(dataset_list, gee_feature_path=None, basins_gdf=None, categorical=[False], verbose=False)
Download a GEE raster as a GeoTiff clipped to a basins GeoDataFrame
- Parameters:
- dataset_listlist of Datasets
List of Dataset objects to compute statistics over.
- gee_feature_pathstr, optional
Path to a GEE feature collection, by default None
- basins_gdfGeoDataFrame, optional
Table of subbasin geometries, by default None
- categoricallist, optional
By default [False]
- verbosebool, optional
By default False
- Returns:
- list
of GeoTiff download urls
Examples
import rabpro from rabpro.basin_stats import Dataset import numpy as np from shapely.geometry import box total_bounds = np.array([-85.91331249, 39.2, -85.5, 39.46429816]) gdf = gpd.GeoDataFrame({"idx": [1], "geometry": [box(*total_bounds)]}, crs="EPSG:4326") dataset_list = [ Dataset("ECMWF/ERA5_LAND/MONTHLY", "temperature_2m", time_stats=["median"]) ] urls, tasks = rabpro.basin_stats.image(dataset_list, basins_gdf=gdf) basin_stats._fetch_raster(urls[0])
Basin delineation (basins.py)
Functions to calculate basin geometries.
- rabpro.basins.load_continent_basins(gdf, level_one, level_twelve)
Load a HydroBasins continent polygon
- Parameters:
- gdfGeoDataFrame
Geometry data. Should be in EPSG:4326.
- level_onestr
Path to level 1 HydroBasins data
- level_twelvestr
Path to level 12 HydroBasins data
- Returns:
- GeoDataFrame
HydroBasins
- Warns:
- UserWarning
If coordinate does not lie in the HydroBasins polygons
- rabpro.basins.main_hb(gdf, verbose=False)
Calculates basins using HydroBASINS
- Parameters:
- gdfGeoDataFrame
Contains geometry and DA (drainage area) columns.
- verbosebool, optional
By default False
- Returns:
- subbasins_gdfGeoDataFrame
Contains subbasin geometries
- sb_inc_gdfGeoDataFrame
Contains the polygons of the incremental catchments. The upstream-most basin will be the largest polygon in most cases, but that depends on the input centerline.
- cl_dasnumpy.ndarray
Drainage areas
- Warns:
- RuntimeWarning
If gdf has no CRS defined
Examples
import rabpro coords = (56.22659, -130.87974) rpo = rabpro.profiler(coords, name='basic_test') test = rabpro.basins.main_hb(rpo.gdf)
- rabpro.basins.main_merit(gdf, da, nrows=51, ncols=51, map_only=False, verbose=False)
Calculates basins using MERIT
- Parameters:
- gdfGeoDataFrame
Centerline coordinates
- danumeric
Drainage area in km^2
- nrowsint
by default 51
- ncolsint
by default 51
- map_onlybool
If True, will map the coordinate to a MERIT flowline but not delineate the basin. The default is False.
- verbosebool
by default False
- Returns:
- basinsGeoDataFrame
The delineated watershed
- mappeddict
Information about the mapping quality.
Utility functions (utils.py)
- rabpro.utils.area_4326(pgons_4326)
Given a list of shapely polygons in EPSG:4326, compute the area in km^2. Only returns the area of the perimeter; does not yet account for holes in the polygon. Mutlipolygons are supported.
- rabpro.utils.build_gee_vector_asset(basins, out_path='basins.zip')
Create zipped shapefile for uploading as a Google Earth Engine vector asset.
- Parameters:
- basinsGeoDataFrame
- out_pathstr, optional
by default “basins.zip”
- Returns:
- str
path to zip file
Examples
from rabpro import utils import geopandas as gpd basins = gpd.read_file("tests/results/merit_test/subbasins.json") utils.build_gee_vector_asset(basins)
- rabpro.utils.build_virtual_rasters(datapaths, skip_if_exists=False, verbose=True, **kwargs)
Builds virtual rasters on the four MERIT-Hydro tilesets.
- Parameters:
- datapaths: dict
Contains the paths to the data. Generate with get_datapaths().
- skip_if_exists: bool, optional
If True, will not rebuild the virtual raster if one already exists. The default is False.
- verbose: bool, optional
If True, will provide updates as virtual rasters are built.
- **kwargs
Arguments passed to the build_vrt function.
- Warns:
- RuntimeWarning
Missing data
Examples
from rabpro import utils d_paths = utils.get_datapaths() utils.build_virtual_rasters(d_paths, extents=[-180.00041666666667, 179.99958333333913, 84.99958333333333, -60.000416666669], verbose=False)
- rabpro.utils.build_vrt(tilespath, clipper=None, extents=None, outputfile=None, nodataval=None, res=None, sampling='nearest', ftype='tif', separate=False, quiet=False)
Creates a text file for input to gdalbuildvrt, then builds vrt file with same name. If output path is not specified, vrt is given the name of the final folder in the path.
- Parameters:
- tilespathstr
the path to the file (or folder of files) to be clipped– if tilespath contains an extension (e.g. .tif, .vrt), then that file is used. Otherwise, a virtual raster will be built of all the files in the provided folder. if filespath contains an extension (e.g. .tif, .vrt), filenames of tiffs to be written to vrt. This list can be created by tifflist and should be in the same folder
- clipperstr, optional
path to a georeferenced image, vrt, or shapefile that will be used to clip. By default None
- extentslist, optional
the extents by which to crop the vrt. Extents should be a 4 element list: [left, right, top, bottom] in the same projection coordinates as the file(s) to be clipped. By default None
- outputfilestr, optional
path (including filename w/ext) to output the vrt. If none is provided, the vrt will be saved in the ‘filespath’ path. By default None
- nodatavalint, optional
value to be masked as nodata, by default None
- resflt, optional
resolution of the output vrt (applied to both x and y directions), by default None
- samplingstr, optional
resampling scheme (nearest, bilinear, cubic, cubicspline, lanczos, average, mode), by default “nearest”
- ftypestr, optional
“tif” if building from a list of tiffs, or “vrt” if building from a vrt, by default “tif”
- separatebool, optional
See separate argument to gdalbuildvrt, by default False
- quietbool, optional
Set True to print progress, by default False
- Returns:
- str
path of the built virtual raster
- Raises:
- TypeError
Unsupported file type passed in ‘ftype’
- RuntimeError
No files found to build raster or raster build fails
- rabpro.utils.coords_to_merit_tile(lon, lat)
Identify MERIT-Hydro “tiles” of interest. See http://hydro.iis.u-tokyo.ac.jp/~yamadai/MERIT_Hydro/
- Parameters:
- lonfloat
- latfloat
Examples
from rabpro import utils utils.coords_to_merit_tile(178, -17) # > "s30e150" utils.coords_to_merit_tile(-118, 32) # > "n30w120" utils.coords_to_merit_tile(-97.355, 45.8358) # > "n30w120"
- rabpro.utils.crop_binary_coords(coords)
Crops an array of (row, col) coordinates (e.g. blob indices) to the smallest possible array. Taken from RivGraph.im_utils
- Parameters:
- coords: np.array
N x 2 array. First column are rows, second are columns of pixel coordinates.
- Returns:
- I: np.array
Image of the cropped coordinates, plus padding if desired.
- clipped: list
Number of pixels in [left, top, right, bottom] direction that were clipped. Clipped returns the indices within the original coords image that define where I should be positioned within the original image.
- rabpro.utils.dist_from_da(da, nwidths=20)
Returns the along-stream distance of a flowline to resolve for a given DA. An empirical formula provided by https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2013WR013916 (equation 15) is used to estimate width.
- Parameters:
- dafloat or numeric
drainage area in km^2
- nwidthsnumeric
number of channel widths to set distance
- Returns:
- dist: float
Distance in kilometers that represents nwidths*W_bankfull where W_bankfull computed according to Wilkerson et al., 2014.
- rabpro.utils.get_datapaths(root_path=None, config_path=None, force=False, update_gee_metadata=False)
Returns a dictionary of paths to all data that rabpro uses. Also builds virtual rasters for MERIT data and downloads latest version of GEE catalog.
- Parameters:
- root_path: string, optional
Path to rabpro Data folder that contains the HydroBASINS, MERIT-Hydro, and/or gee catalog jsons. Will read from an environment variable “RABPRO_DATA”. If this variable is not set, uses appdirs to create a local data directory. This path is the parent directory for the MERIT and HydroBasins data directories.
- config_path: string, optional
Path to rabpro config folder. Will read from an environment variable “RABPRO_CONFIG”. If not set, uses appdirs to create local directory.
- force: boolean, optional
Set True to override datapath caching. Otherwise only fetched once per py session.
- update_gee_metadata: boolean, optional
If True, will attempt to download the latest GEE dataset metadata.
- Returns:
- dict
contains paths to all data that rabpro uses
Examples
from rabpro import utils utils.get_datapaths()
- rabpro.utils.get_exportpaths(name, basepath=None, overwrite=False)
Returns a dictionary of paths for exporting rabpro results. Also creates “results” folders when necessary.
- Parameters:
- namestr
Name of directory to create within “results” directory.
- basepathstr, optional
path to put “results” directory. By default None. If None, creates in current working directory.
- overwritebool, optional
overwrite “name” directory, by default False
- Returns:
- dict
contains paths to all output that rabpro generates
- rabpro.utils.haversine(lats, lons)
Computes distances between latitude and longitude pairs of points.
- Parameters:
- latsnumpy array (or interpretable by numpy)
latitude values
- lonsnumpy array (or interpretable by numpy)
longitude values
- Returns:
- numpy.ndarray
Distance in meters between each point defined by lats, lons.
- rabpro.utils.lonlat_plus_distance(lon, lat, dist, bearing=0)
Returns the lon, lat coordinates of a point that is dist away (dist in km) and with a given bearing (in degrees, 0 is North). Uses a Haversine approximation.
- rabpro.utils.raster_extents(raster_path)
Output raster extents as [xmin, xmax, ymin, ymax]
- Parameters:
- raster_pathstr
Path to file
- Returns:
- list
[xmin, xmax, ymin, ymax]
Examples
from rabpro import utils utils.raster_extents(utils.get_datapaths(rebuild_vrts=False)["DEM_fdr"])
- rabpro.utils.union_gdf_polygons(gdf, idcs, buffer=True)
Given an input geodataframe and a list of indices, return a shapely geometry that unions the geometries found at idcs into a single shapely geometry object.
This function also buffers each polygon slightly, then un-buffers the unioned polygon by the same amount. This is to avoid errors associated with floating-point round-off; see here: https://gis.stackexchange.com/questions/277334/shapely-polygon-union-results-in-strange-artifacts-of-tiny-non-overlapping-area
- Parameters:
- gdfGeoDataFrame
Geometries to combine
- idcslist
Indicates which geometris in ‘gdf’ to combine
- bufferbool, optional
buffer polygons, by default True
- Returns:
- shapely.geometry object
Union of passed geometries
- rabpro.utils.upload_gee_tif_asset(tif_path, gee_user, gcp_bucket, title, gcp_folder='', gee_folder='', time_start='', epsg='4326', description='', citation='', gcp_upload=True, gee_upload=True, dry_run=False, gee_force=False)
Upload a GeoTIFF file as a GEE raster asset
- Parameters:
- tif_pathstr
Path to GeoTIFF file
- gee_userstr
Google Earth Engine user name
- gcp_bucketstr
Google Cloud Platform bucket url
- titlestr
GEE asset title
- gcp_folderstr, optional
Google Cloud Platform bucket folder, by default “”
- gee_folderstr, optional
Google Earth Engine asset folder, by default “”
- time_startstr, optional
YYYY-MM-DD, by default “”
- epsgstr, optional
EPSG CRS code, by default “4326”
- descriptionstr, optional
GEE asset description text, by default “”
- citationstr, optional
GEE asset citation text, appended to description, by default “”
- gcp_uploadbool, optional
Set False to skip GCP uploading, by default True
- gee_uploadbool, optional
Set False to skip GEE uploading, by default True
- dry_runbool, optional
Set True to skip GCP and GEE uploading, by default False
- gee_forcebool, optional
Set True to overwrite any existing GEE assets, by default False
- Returns:
- NoneType
None
- Warns:
- RuntimeWarning
Examples
from rabpro import utils utils.upload_gee_tif_asset("my.tif", "my_gee_user", "my_gcp_bucket")
- rabpro.utils.upload_gee_vector_asset(zip_path, gee_user, gcp_bucket, gcp_folder='', gcp_upload=True, gee_upload=True)
Upload a zipped shapefile as a GEE vector asset
- Parameters:
- zip_pathstr
Path to zipped shapefile.
- gee_userstr
Google Earth Engine user name.
- gcp_bucketstr
Google Cloud Platform bucket url (e.g. gs://my_bucket_name).
- gcp_folderstr, optional
Google Cloud Platform bucket folder, by default “”
- gcp_uploadbool, optional
Set False to skip GCP uploading, by default True
- gee_uploadbool, optional
Set False to skip GEE uploading, by default True
- Returns:
- str
GEE asset path
- Raises:
- RuntimeError
Throws error if gsutil is not installed or authenticated.
Examples
from rabpro import utils utils.upload_gee_vector_asset("test.zip", "my_gee_user", "my_gcp_bucket")
- rabpro.utils.validify_polygons(polys)
Hacky ways to validify a (multi)polygon. If can’t be validified, returns the original.
- Parameters:
- geomlist
List of shapely.geometry.Polygon
- Returns:
- geomsvlist
List of shapely.geometry.Polygon that have been attempted to validify.
- rabpro.utils.xy_to_coords(xs, ys, gt)
Transforms a set of x and y coordinates to their corresponding coordinates within a geotiff image.
- Parameters:
- xsnumpy.ndarray
x coordinates to transform
- ysnumpy.ndarray
y coordinates to transform
- gttuple
6-element tuple gdal GeoTransform. (uL_x, x_res, rotation, ul_y, rotation, y_res). Automatically created by gdal’s GetGeoTransform() method.
- Returns:
- cx, cytuple of ints
Column and row indices of the provided coordinates.