Calculate daily summary statistics
Calculates means, medians, maximums, minimums, and percentiles for each day of the year of flow values
from a daily streamflow data set. Can determine statistics of rolling mean days (e.g. 7-day flows) using the roll_days
argument. Note that statistics are based on the numeric days of year (1-365) and not the date of year (Jan 1 - Dec 31).
Calculates statistics from all values unless otherwise specified. Returns a tibble with statistics.
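To make the day-of-year note concrete, here is a minimal base-R sketch (it does not call fasstr) showing that the same calendar date can fall on a different numeric day of year in leap and non-leap years. The `DayofYear` column name mirrors the output described on this page; how the function itself treats day 366 is not stated here, so the sketch makes no assumption about it.

```r
# Numeric day of year for dates around the end of February
dates <- as.Date(c("2019-03-01", "2020-02-29", "2020-03-01"))

# "%j" formats a date as its day of year (001-366)
doy <- as.integer(format(dates, "%j"))

# March 1 is day 60 in 2019 but day 61 in the leap year 2020,
# where day 60 is February 29
data.frame(Date = dates, DayofYear = doy)
```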
calc_daily_stats(
  data,
  dates = Date,
  values = Value,
  groups = STATION_NUMBER,
  station_number,
  percentiles = c(5, 25, 75, 95),
  roll_days = 1,
  roll_align = "right",
  water_year_start = 1,
  start_year,
  end_year,
  exclude_years,
  complete_years = FALSE,
  months = 1:12,
  transpose = FALSE,
  ignore_missing = FALSE
)
data |
Data frame of daily data that contains columns of dates, flow values, and (optional) groups (e.g. station numbers).
Leave blank if using |
dates |
Name of column in |
values |
Name of column in |
groups |
Name of column in |
station_number |
Character string vector of seven digit Water Survey of Canada station numbers (e.g. |
percentiles |
Numeric vector of percentiles to calculate. Set to |
roll_days |
Numeric value of the number of days to apply a rolling mean. Default |
roll_align |
Character string identifying the direction of the rolling mean from the specified date, either by the first
( |
water_year_start |
Numeric value indicating the month ( |
start_year |
Numeric value of the first year to consider for analysis. Leave blank to use the first year of the source data. |
end_year |
Numeric value of the last year to consider for analysis. Leave blank to use the last year of the source data. |
exclude_years |
Numeric vector of years to exclude from analysis. Leave blank to include all years. |
complete_years |
Logical value indicating whether to include only years with complete data in analysis. Default |
months |
Numeric vector of months to include in analysis (e.g. |
transpose |
Logical value indicating whether to transpose rows and columns of results. Default |
ignore_missing |
Logical value indicating whether dates with missing values should be included in the calculation. If
|
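As an illustration of what the roll_days and roll_align arguments describe, the following is a rough base-R sketch of a rolling mean with different alignments. The helper `rolling_mean()` is hypothetical and not part of fasstr, and the alignment names "left" and "center" are assumptions here (only "right" appears on this page, as the default); the function's actual internals may differ.

```r
# Hypothetical helper (not part of fasstr): k-day rolling mean in base R
rolling_mean <- function(x, k, align = c("right", "left", "center")) {
  align <- match.arg(align)
  w <- rep(1 / k, k)  # equal weights for a simple mean
  out <- switch(align,
    # value for a day is the mean of that day and the k - 1 days before it
    right  = stats::filter(x, w, sides = 1),
    # value for a day is the mean of that day and the k - 1 days after it
    left   = rev(stats::filter(rev(x), w, sides = 1)),
    # value for a day is the mean of the window centred on that day
    center = stats::filter(x, w, sides = 2)
  )
  as.numeric(out)
}

flow <- c(10, 12, 14, 16, 18, 20, 22)

# Right-aligned 3-day mean: the first two days have no complete window (NA)
rolling_mean(flow, 3, "right")

# Left-aligned 3-day mean: the last two days have no complete window (NA)
rolling_mean(flow, 3, "left")
```

With a right-aligned window, each value summarizes the days leading up to and including the given date, so the leading days of a record stay missing until a full window is available.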
A tibble data frame with the following columns:
Date |
date (MMM-DD) of daily statistics |
DayofYear |
day of year of daily statistics |
Mean |
daily mean of all flows for a given day of the year |
Median |
daily median of all flows for a given day of the year |
Maximum |
daily maximum of all flows for a given day of the year |
Minimum |
daily minimum of all flows for a given day of the year |
P'n' |
each selected daily n-th percentile of all flows for a given day of the year |
Default percentile columns:
P5 |
daily 5th percentile of all flows for a given day of the year |
P25 |
daily 25th percentile of all flows for a given day of the year |
P75 |
daily 75th percentile of all flows for a given day of the year |
P95 |
daily 95th percentile of all flows for a given day of the year |
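The columns above can be approximated in base R as a rough sketch of how per-day statistics arise from a daily record. This does not call fasstr: the flow values are synthetic, and `stats::quantile()` is used with its default interpolation, which is an assumption and may not match the function's own percentile method.

```r
# Synthetic daily record covering three non-leap years (illustrative values only)
set.seed(1)
dates <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
flow <- data.frame(Date = dates, Value = runif(length(dates), 1, 100))
flow$DayofYear <- as.integer(format(flow$Date, "%j"))

percentiles <- c(5, 25, 75, 95)

# Group flow values by numeric day of year, then summarize each group
by_day <- split(flow$Value, flow$DayofYear)
stat_matrix <- do.call(rbind, lapply(by_day, function(v) {
  p <- stats::quantile(v, probs = percentiles / 100, names = FALSE)
  names(p) <- paste0("P", percentiles)  # P5, P25, P75, P95
  c(Mean = mean(v), Median = stats::median(v),
    Maximum = max(v), Minimum = min(v), p)
}))

# One row per day of year, one column per statistic
daily_stats <- data.frame(DayofYear = as.integer(rownames(stat_matrix)),
                          stat_matrix, row.names = NULL)
head(daily_stats)
```

Each row summarizes the values observed on that numeric day across all years in the record, which is why a longer record gives more stable percentiles.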
Transposing data creates a column of "Statistics" and subsequent columns for each year selected.
# Run if HYDAT database has been downloaded (using tidyhydat::download_hydat())
if (file.exists(tidyhydat::hy_downloaded_db())) {

  # Calculate daily statistics using data argument with defaults
  flow_data <- tidyhydat::hy_daily_flows(station_number = "08NM116")
  calc_daily_stats(data = flow_data,
                   start_year = 1980)

  # Calculate daily statistics using station_number argument with defaults
  calc_daily_stats(station_number = "08NM116",
                   start_year = 1980)

  # Calculate daily statistics regardless if there is missing data for a given day of year
  calc_daily_stats(station_number = "08NM116",
                   ignore_missing = TRUE)

  # Calculate daily statistics using only years with no missing data
  calc_daily_stats(station_number = "08NM116",
                   complete_years = TRUE)

  # Calculate daily statistics for water years starting in October between 1980 and 2010
  calc_daily_stats(station_number = "08NM116",
                   start_year = 1980,
                   end_year = 2010,
                   water_year_start = 10)

  # Calculate daily statistics with custom years and removing certain years
  calc_daily_stats(station_number = "08NM116",
                   start_year = 1981,
                   end_year = 2010,
                   exclude_years = c(1991, 1993:1995))

  # Calculate daily statistics for 7-day flows for July-September months only,
  # with 25 and 75th percentiles starting in 1980
  calc_daily_stats(station_number = "08NM116",
                   start_year = 1980,
                   roll_days = 7,
                   months = 7:9,
                   percentiles = c(25, 75))

}