Technical notes

These Technical Notes cover:

  • Rating curves and equations
  • Indicative Suitability
  • History - station types and datums
  • Missing data
  • Annual maxima for incomplete years
  • Peaks over threshold data

In addition, the Glossary explains particular terms under these headings:

  • General terms used in the Flood Estimation Handbook
  • FEH catchment descriptors
  • Other FEH terms
  • Hydrological Calendar
  • AMAX and POT pages
  • Codes used for station types

Rating curves and equations

Rating curves and equations relate the stage (water level) measured at a station to the flow (discharge).  They are also known as stage-discharge equations.

Rating curves in HiFlows-UK are expressed as:

Q = K(h+a)^p    

where Q is the discharge, h is the gauged stage, a is the stage at zero flow (datum correction), and K and p are constants.  This is equivalent to Q = C(h+a)^n  in the Flood Estimation Handbook (Volume 3 page 274).

This is the format of all rating curves supplied to the project.  The equation is presented as it is read and used.  Users should note that the recent ISO standard uses the same general form of the equation but with -a, that is (h-a) in place of (h+a).
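
As an illustration, the rating equation above can be evaluated with a few lines of Python. This is a minimal sketch only: the function names and the example constants are hypothetical and are not taken from any HiFlows-UK station, and returning zero flow when (h+a) is not positive is an assumption made here for convenience.

```python
def rating_flow(h, K, a, p):
    """Discharge Q from gauged stage h, using Q = K * (h + a) ** p."""
    head = h + a
    if head <= 0:
        # Assumption: stage at or below the level of zero flow gives no discharge.
        return 0.0
    return K * head ** p


def offset_for_minus_convention(a):
    """Offset to use in a Q = K * (h - a) ** p form for the same curve."""
    return -a


# Hypothetical constants, for illustration only.
q = rating_flow(h=1.5, K=10.0, a=0.5, p=2.0)  # head = 2.0, so q = 40.0
```

The second function only flips the sign of the offset, which is all that is needed to move between the (h+a) form used here and a form written with (h-a).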

Users should be aware that the stage-discharge equations applied within HiFlows-UK are not always the same as those applied in the main hydrometric archives of the Environment Agency, Scottish Environment Protection Agency and Rivers Agency.  The equations applied in HiFlows-UK have been considered to be the most appropriate for derivation of a flood flow series, and they may not be appropriate for other purposes (such as low flows).  Ratings may be changed and improved as more data are collected or more detailed analyses are carried out, and these improvements may be applied retrospectively.  In the longer term, it is hoped to consolidate to a more consistent series of rating equations in the various archives.

Indicative Suitability

The "Indicative Suitability" relates solely to the perceived quality of flow data at the station. It does not take into account the relevance of catchment descriptors in selecting stations - in particular, nothing about urbanisation (URBEXT) or lake or reservoir effects (FARL). Broad guidance in the FEH is not to use stations for transfer of data (such as from an analogue station for QMED) or pooling if URBEXT is more than 0.025 or FARL is less than 0.95. The actual choice of stations as an analogue or donor station, or for a pooling group, may also depend on the number of suitable stations available for selection, and hydrologists may have to balance similarity of catchments and flow statistics against quality of flow data. Hence, these comments are indicative, to help hydrologists make their own decisions.
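
Because the Indicative Suitability flags say nothing about URBEXT or FARL, a user may wish to apply the broad FEH guidance quoted above as a separate screening step. The sketch below shows one possible way to do so. The function name, the dictionary layout and the station entries are hypothetical, and the result is only a first filter, not a substitute for the hydrologist's judgement.

```python
URBEXT_MAX = 0.025  # broad FEH guidance: avoid stations with URBEXT above this
FARL_MIN = 0.95     # broad FEH guidance: avoid stations with FARL below this


def passes_broad_guidance(urbext, farl):
    """True if the catchment meets both broad FEH limits."""
    return urbext <= URBEXT_MAX and farl >= FARL_MIN


# Hypothetical candidate stations, for illustration only.
candidates = [
    {"station": "A", "urbext": 0.010, "farl": 0.99},
    {"station": "B", "urbext": 0.040, "farl": 0.99},  # too urbanised
    {"station": "C", "urbext": 0.010, "farl": 0.90},  # lake or reservoir effects
]

screened = [c["station"] for c in candidates
            if passes_broad_guidance(c["urbext"], c["farl"])]
```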

The general criterion for "Indicative suitability" for QMED has been to accept the station if QMED is likely to be within 30% of its true value.

The general criterion for "Indicative suitability" for pooling has been to accept the station for pooling if either the third highest annual maximum (AMAX) flow or the 8 year event is likely to be within 30% of its true value.

History - station types and datums

The History page is a short table which summarises the station types and datums over the period of record, with dates of applicability.

Missing data

Users may note instances when the AMAX record has no data for a particular water year, but there is no corresponding entry in the Missing Data table.

The section "Annual maxima for incomplete years" (below) describes how gaps in the record are identified and handled for the AMAX series.

The Missing Data page for each station lists the dates of all clearly identified periods of missing data and the length of the gap in whole days.  However, in the pre-digital period it is not always clear when stations have gaps in the record, and any such undefined gaps are therefore not listed.

Annual maxima for incomplete years

Incomplete or partial years (when there are gaps in the record) present a difficulty for a largely automated process such as that applied in HiFlows-UK, where other information is not available.

HiFlows-UK shows the highest recorded values in each year, but indicates where it is most likely that the value may not be the highest which occurred in the year.  These values are marked yellow in the table and plot on the AMAX pages, and are listed as rejected years in the .am file which is downloaded for use in WINFAP-FEH. By default, WINFAP-FEH will exclude from the analysis all years listed as "rejected", though the user can choose to include them.

Ideally, where there is only a partial record in a year, the AMAX and POT values should be compared with a nearby station on the same river. Rainfall records and other stations can also be used. Within HiFlows-UK, this type of information was used where already available, or identified in passing during the project.

The process applied in HiFlows-UK to pre-digital data
(where no other information was available)

The process was applied not only to the first year of record, but also to any gaps in the record - to the first AMAX value, to the AMAX value before a missing year, and to the AMAX value after a missing year.

Firstly, missing periods for pre-digital data were identified where more than 25% of the time period was missing. "Missing data records" were not used because these are often unreliable in the early years of a record. Instead, the occurrence of the first or last POT in a series was used as an indicator of the possible incompleteness of a water year.  For example, if the first POT in the first water year did not occur until February then the year would be provisionally flagged as incomplete.

Secondly, these years would be re-included if the AMAX peak was in the top 37% of AMAX values for the station.

The criterion of including flows within the top 37% of the AMAX series is based on its likely effects:

a) exclusion of values above QMED will reduce the QMED estimate derived from the AMAX series (which argues for a cut-off at 50%)
b) retention of spurious values just above QMED will tend to lower station growth curves.

The percentage value of 37% was derived pragmatically, after an analysis of the data in all incomplete years, as giving data series which are likely to give reasonable estimates of both QMED and growth curves.
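
The two steps described above can be sketched as follows. This is an illustrative reading rather than the project's actual code: the function names are hypothetical, the fraction of the year missing is taken as a given input (in the project it was inferred from the timing of the first or last POT), and "top 37%" is interpreted here by simple ranking, which is an assumption.

```python
def provisionally_incomplete(fraction_missing, limit=0.25):
    """Step 1: flag the year if more than 25% of it is missing."""
    return fraction_missing > limit


def in_top_fraction(value, amax_values, fraction=0.37):
    """Step 2 test: does value rank within the top fraction of the series?"""
    n_larger = sum(1 for v in amax_values if v > value)
    rank = n_larger + 1  # rank 1 is the largest AMAX
    return rank <= fraction * len(amax_values)


def reject_year(fraction_missing, amax, amax_values):
    """A year is rejected if flagged in step 1 and not re-included in step 2."""
    return (provisionally_incomplete(fraction_missing)
            and not in_top_fraction(amax, amax_values))


# Hypothetical AMAX series of ten values, for illustration only.
series = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
```

With this series, an incomplete year with an AMAX of 80 would be re-included (rank 3 of 10), while one with an AMAX of 70 would remain rejected (rank 4 of 10).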

We will welcome feedback from users on specific values where stations have incomplete years.

Peaks over threshold data

POT data sources

The source of each POT value is given in the POT table. Recent years have been extracted directly from digital archives. The main sources for early years are the FEH \ CEH dataset, the Hydpeaks archive in Thames region, and the University of Dundee for POT data for Scotland, while in Northern Ireland POTs have been extracted manually as part of the gauging authority's ongoing procedures.

POT from digital archives

POT values have been extracted from digital archives wherever possible.

The extraction criteria used are broadly those set out in the FEH Volume 3, section 23.5.1 (page 275). The procedure described there is suitable for automatic data extraction followed by inspection to remove any remaining erroneous peaks.  However, for the entirely automated procedure within HiFlows-UK, the FEH's independence criterion that the trough between two peaks must be less than 2/3 of the magnitude of the first of the two peaks was modified so that the trough had to be less than 2/3 of the magnitude of both peaks.  This was done in order to exclude minor blips on the recession limb (such as might occur due to a very small amount of rain at the end of a long recession limb).
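
The difference between the two versions of the trough test can be seen in a short sketch. It covers only the trough criterion described above and no other part of the extraction procedure. The function names and the example flows are hypothetical.

```python
def independent_first_peak_only(peak1, trough, peak2):
    """Trough test as described for the FEH: compare with the first peak only."""
    return trough < (2.0 / 3.0) * peak1


def independent_both_peaks(peak1, trough, peak2):
    """Modified test: trough must be below 2/3 of both peaks."""
    return trough < (2.0 / 3.0) * min(peak1, peak2)


# A large peak followed by a minor blip on the recession limb
# (hypothetical flows in m3/s).
peak1, trough, peak2 = 30.0, 15.0, 18.0
first_only = independent_first_peak_only(peak1, trough, peak2)  # 15 < 20
both = independent_both_peaks(peak1, trough, peak2)             # 15 < 12 fails
```

In this example the first-peak test would accept the blip as a separate event, whereas the modified test would not, because the trough is not below 2/3 of the smaller second peak.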

POT from non-digital archives

The FEH \ CEH dataset was derived for the Flood Studies Report of 1975.  It is used where other data are not available or are considered less reliable.  Much time and effort was spent in extracting these data and the record is generally considered to be good.  The FEH .pt files only present flow data, which may mean that the whole data series is not consistent where rating curves have been changed significantly by subsequent re-analysis. However, for the majority of stations (but unfortunately not all) the levels were also available in 'paper' format, and this data has been added so that the whole data series can be processed to give a consistent flow record, whatever the changes in rating. The average number of events per year is lower than applied in digital extraction, and this may also mean that there appear to be gaps in the early record when it is in fact continuous.

There may be discrepancies where more than one source has been used to extract POT data, particularly where one source has used Calendar Days and another Water Days, possibly with an unknown time set to 00:00.  This can also lead to duplication at the join between data sources, as shown in the example below. Such duplicate values have been removed where identified.

Date          Time     Stage    Flow     Rating      Source    Ref
01/12/1975    00:00    1.180    14.93    In Range    FSR       1a
02/12/1975    04:30    1.185    15.03    In Range    Mfiche    1a
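
A pair such as the one above could be picked out for inspection with a simple check on neighbouring records. The sketch below is hypothetical: the notes do not say how duplicates were identified, so the two-day window, the 2% flow tolerance, the function name and the record layout are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def likely_duplicates(records, max_gap=timedelta(days=2), rel_tol=0.02):
    """Return neighbouring pairs from different sources that look like one event."""
    ordered = sorted(records, key=lambda r: r["when"])
    pairs = []
    for first, second in zip(ordered, ordered[1:]):
        different_source = first["source"] != second["source"]
        close_in_time = second["when"] - first["when"] <= max_gap
        flow_gap = abs(first["flow"] - second["flow"])
        similar_flow = flow_gap <= rel_tol * max(first["flow"], second["flow"])
        if different_source and close_in_time and similar_flow:
            pairs.append((first, second))
    return pairs


# The two rows from the example table.
records = [
    {"when": datetime(1975, 12, 1, 0, 0), "flow": 14.93, "source": "FSR"},
    {"when": datetime(1975, 12, 2, 4, 30), "flow": 15.03, "source": "Mfiche"},
]
suspects = likely_duplicates(records)
```

A check of this kind only suggests candidates. Whether two records are really the same event still needs to be confirmed by inspection.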

The Hydpeaks archive in Thames region stores the peak level and flow in each month. Thus, when there are two or more large events in the same month, only the largest is extracted to the POT record.

Post-processing of POT data

Different sources may have extracted POT data by different methods, and a series might have been extracted using different thresholds at different times.  In many cases this gives fewer POT values in each year in the earlier non-digital record than in the digital record.

However, wherever possible the same threshold has subsequently been imposed on the whole dataset.  Therefore, although the digital extraction threshold is typically very low (to get 10 to 12 peaks per year), for the POT (.pt) files and data shown on the web page, the threshold set for the full data series is the highest used in the history of POT extraction at a station, as long as this still gives 3 to 5 peaks per year.  Finally, this was adjusted by eye to give some consistency for the whole period of record.  The resulting data series in the .pt file and the POT webpage should be the same as one extracted at the higher (and consistent) threshold.
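
The choice of a common threshold can be sketched as below. This is a simplified reading of the rule above, not the procedure actually used: the function name and the example data are hypothetical, falling back to the next highest threshold when the highest gives too few peaks is an assumption, and the final adjustment by eye is not reproduced.

```python
def common_threshold(thresholds_used, peak_flows, n_years, min_rate=3.0):
    """Pick the highest historical threshold that keeps enough peaks per year."""
    for threshold in sorted(thresholds_used, reverse=True):
        kept = [q for q in peak_flows if q > threshold]
        if len(kept) / n_years >= min_rate:
            return threshold, kept
    # No historical threshold gives enough peaks: leave the series unchanged.
    return None, list(peak_flows)


# Hypothetical two-year record extracted with three different thresholds.
peaks = [6, 7, 8, 11, 12, 13, 14, 15, 21, 22]
threshold, kept = common_threshold([5, 10, 20], peaks, n_years=2)
```

Here the highest threshold (20) would leave only one peak per year, so the next one (10) is chosen, which keeps seven peaks over the two years.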

The .csv files contain all the POT data extracted (i.e. without the post-processing described above, and hence possibly with varying thresholds).

Stations without POT data
 
Some stations have no POT data or .pt file because they have been considered to be unsuitable for POT extraction. This is usually because the gauged catchment response does not lend itself to POT analysis.  Such rivers are usually dominated by baseflow and are therefore concentrated on the large chalk aquifers of southern and eastern England.