A new fish-based index of biotic integrity for lowland rivers in Flanders (Belgium)

The first fish-based index to assess the ecological quality of lowland rivers in Flanders (Belgium) is based on data obtained from different fishing techniques without considering the gear specificity. As a consequence, this index could not be intercalibrated with other European indices which concentrate on one gear type only. In order to comply with the European Water Framework Directive, we developed a new fish-based index using data obtained from surveys in rivers with electric gear only. All 293 selected rivers belonged to the bream or barbel zone. An updated reference list of fish species was compiled based on previous work and recent data. Abiotic data were collected according to standard methods and habitat quality of all surveyed sites was pre-classified using pressure indicators. To develop the new index candidate metrics were selected from the literature and metric values were calculated. Linear mixed regression models selected metrics based on their response to the pre-classified habitat status. Correlation tests were performed to avoid redundancy among responsive metrics. Boundaries for metric scores were defined based on the calculated metric values. The new index of biotic integrity (IBI) was calculated by summation of the metric scores, and transformed to an ecological quality ratio (EQR), ranging between 0 and 1. Five integrity classes, ranging from bad to maximal ecological potential, were attributed and compared to the pre-classified habitat status of the site. In addition, the new index was also validated with an independent set of data. The new IBI proved to successfully assess the ecological status of the rivers.


Introduction
Fish are sensitive to changes in habitat quality in rivers (Karr 1981;Fausch et al. 1990;Minier et al. 2015). Spatial and temporal variability of the environment influences physiological and morphological characteristics of fish and affects their behaviour (Townsend & hildrew 1994). Fish populations in many European rivers are impacted by deteriorated habitat and water quality. With the European Water Framework Directive (WFD) the European Union established a framework for the protection, recovery, and rehabilitation of all EU waters. The common goal of the WFD is to achieve and maintain "good ecological status" for all aquatic habitats. The WFD instructs using phytoplankton, macrophytes, benthic invertebrates and fish as biological quality elements to assess the ecological status of rivers (EU waTer FraMeworK direcTiVe 2000). As a consequence, fish-based indices were developed throughout Europe to assess the ecological quality of rivers (e.g., Belpaire et al. 2000;KesMinas & VirBicKas 2000;KesTeMonT et al. 2000;angerMeier & daVideanu 2004;ponT et al. 2006). Fish-based indices are calculated using multiple metrics (Karr 1981). Each of these metrics is a variable assessing an ecological attribute of a fish community that is sensitive to human impact and reacts unambiguously to impact changes .
In Flanders (Belgium) a fish-based index for rivers was developed by Belpaire et al. (2000) and subsequently used to report river habitat quality following the WFD requirements. Intensive application of this index has revealed some shortcomings and the need for improvement. First the robustness of this index is questioned as it is based on data obtained using electric fishing, seine netting, fyke nets and gill nets without considering gear specificity. Different sampling methods give selective or biased estimates of fish species abundance, distribution and size structure (laponTe et al. 2006;JurVelius et al. 2011;han et al. 2019). pasquaud et al. (2012) evaluated monitoring results from four different fishing techniques in the Gironde River, demonstrating that sampling gear had a significant effect on ecological assessments. Moreover, fish responses to particular anthropogenic pressures are gear type specific, hence indices should be tailored to the gear used in the surveys (chow-Fraser et al. 2006). Second, new robust statistical approaches are now available and have been demonstrated useful for the development of new fish-based indices (e.g., MosTaFaVi et al. 2019;dézerald et al. 2020). Third, the European fish index, which was developed as an alternative for existing local indices (ponT et al. 2006), is not a valid alternative because it does not assess Flandrian rivers correctly as it is based on references that do not occur in Flanders (quaTaerT et al. 2007).
We hypothesise that a more robust fish-based index for rivers in Flanders (the northern part of Belgium) can be designed and validated, based on single fish gear data and a reliable statistical approach that allows to reduce the use of expert judgement for assessment. Only fish data from lowland rivers belonging to the bream and barbel zones (hueT 1949), obtained with electric gear in accordance with the CEN standards (CEN 2003) were used. Our study took advantage of an extensive set of data collected over more than 20 years during 1545 fish stock surveys in 675 sites in Flanders. The different steps in the index development are those described by hering et al. (2006).

Fish data and sampling method
Two data sets were established: one to develop the index (development data set) and one for external validation of the index (validation data set). For the development data set surveys between 1993 and 2015 were used: 722 surveys from 337 sampling sites in bream type rivers and 823 surveys from 348 sampling sites of barbel type rivers. These sites are spread over 281 rivers (Fig. 1). Surveys between 2015 and 2018 were used for the validation data set: 62 surveys in bream type rivers and 108 in barbel type rivers. This data set comprises 86 rivers (validation data set). All rivers were characterised as bream or barbel zone based on riverbed slope (measured on maps) and cross section (measured in the field) according to the hueT (1949,1962) typology. River width varies between 0.25 and 30 m and the slope is below 3‰. All rivers have suffered from human impacts such as dam construction, canalisation, agricultural activities or urbanisation.
Sampling was done year round except for winter. Season was defined according to the meteorological definition: spring includes March, April, and May; summer includes June, July, and August; fall includes September, October, and November. Fish assemblages' data was collected by electric fishing. Trained fish biologists and technicians assessed the rivers either by wading or from a boat. In rivers less than 1 meter wide only one anode was used, for larger rivers two hand-held anodes were used. A 5kW generator with adjustable output voltage generated between 300 and 500V and a pulse frequency of 480Hz. When wading, electric fishing was carried out over a standard distance of 100 m in upstream direction. In rivers less than 8 m wide the total width was surveyed, in larger rivers we monitored 2 m along both banks. From a boat a section of 250 m (2.5 m along both banks) was surveyed. The surveyed river bank length occasionally varied due to obstructions or other disturbances. Each captured individual fish was identified, measured (total length (TL), nearest 0.1 cm) and weighed (nearest 0.1g). To correct for differences in sampling effort, survey results were expressed as catch per unit effort (CPUE), i.e., survey data (numbers of individuals and biomass) were standardised to catch results per m². Data are available from the Fish Information Database (VIS databank: http://vis.inbo.be) and in Brosens et al. (2015) and Van Thuyne et al. (2020a).
To assess the relevance of separate indices for the bream and barbel zones we analysed differences between the fish assemblages in these two river types. A Detrended Correspondence Analysis (DCA) using the relative percentage of the 20 most abundant species was performed.

Habitat characteristic
Water quality data recorded in the field with digital multimeters were dissolved oxygen (mg/l), pH, water temperature (°C), conductivity (µS/cm) (Hach HQ40d) and turbidity (NTU, Hach 2100Q). The river width (m) was calculated as the average of three measurements with a tape measure above the water surface at the sampling site. To assess the level of anthropogenic disturbance several variables were recorded and (semi)quantified in an area up to 50 m inland and over a stretch of 250 m up-and downstream along each sampling site (see Pre-classification of habitat quality status).

Index development
Index development was stepwise (Fig. 2), starting with pre-classification of the sites based on a habitat quality assessment. A reference list of fish species attributed to ecological guilds was set up and used for the calculation of candidate metric values. Application of several statistical methods enabled the final selection of metrics. Selected metrics are scored using the development data set, and the Ecological Quality Ratio (EQR) is calculated defining integrity classes. Finally, the validation data set enabled us to evaluate the new index on its performance.

Pre-classification of habitat quality status
The Habitat Status (HabStat) was defined using the habitat characteristics recorded in the field or via Google Earth in those cases were data were incomplete (ergönül et al. 2019). The variables included in the pre-classification are: shelter for fish (e.g., tree trunks, roots, plants), natural banks or reinforced (visual observation of concrete, stones, etc.), industry (industrial activities, e.g., factories), agriculture activity (e.g., crop culture), urbanisation (constructions or buildings) and % trees obtained by visual estimation. Meadows as a proxy of cattle, manure etc., river course alteration (e.g., canalization) and presence of barriers for fish passage upstream and downstream. Threshold and scores are given in Table 1. The sum of the score of variables (SSV) ranged between 0 and 20 and was transformed for standardisation into a habitat status score (HabStat) ranging between 0 and 4 as follow: HabStat=0 for SSV=0; HabStat=1 for SSV=1-5; HabStat=2 for SVV=6-10; HabStat=3 for SSV=11-15 and: HabStat=4 for SSV=16-20. Belpaire et al. (2000) provided a list of 40 fish species occurring in lowland rivers in Flanders. We updated this list and differed between Good and Maximal Ecological Potential (GEP and MEP) status. Fish species were omitted 1) that were locally or regionally extirpated (see VerreycKen et al. 2014), 2) for which the bream or barbel zone as described by hueT (1950, 1962) is not the preferred habitat (see also Breine et al. 2015) and, 3) species defined as non-indigenous according to VerreycKen et al. (2007) based on historical and archaeological records. Exceptions in our list are pike-perch (Sander lucioperca Linnaeus, 1758) and common carp (Cyprinus carpio Linnaeus, 1758) as we consider them as naturalised. We added species that were frequently caught and not yet included. Species that are rarely caught (i.e., <5% catch frequency), are retained in the list representing the MEP, but not in the GEP list .

Reference fish species and attribution to ecological guilds
All species on the final GEP and MEP lists were attributed to metrics referring to ecological guilds based on a literature review (Breine et al. 2004(Breine et al. , 2005. The guilds refer to habitat specificity, reproduction, tolerance to oxygen deficiency and habitat degradation, migration and feeding strategies.  Thresholds and scores for variables to define the pre-classified habitat status. Candidate metrics Ecologically-relevant candidate metrics (e.g., number of species, percentage of specialised spawners) were selected from literature (see Belpaire et al. 2000;Breine et al. 2015). These candidate metrics reflect major fish community attributes such as species richness, trophic function etc. Metric values were calculated using reference species only ) except for the Shannon-Wiener index which was calculated using all species present at the site and the metric assessing pioneers including three non-native species.

Metric selection
To explore the response of the candidate metrics to environmental pressures graphically boxplots were used to show how the metric distribution changes along the HabStat score ). Then the response of candidate metrics to predictors (i.e., pH, water temperature, dissolved oxygen, turbidity, river width, season and HabStat) was analysed with linear mixed regression models as described in ergönül et al. (2018). To retrieve less-skewed distribution, percentage metrics were square-root transformed and count metrics were log-transformed (logx+1) (launois et al. 2011). Diversity metrics (e.g., Shannon-Wiener Index) were not transformed. Only uncorrelated predictors were used in the model. Correlation analysis was done with bivariate Pearson correlation tests (zuur et al. 2009). In addition, collinearity analyses were performed with a Variance Inflation Factor (VIF) analysis as follows: VIF(lm(log(variable+1)~log(variable 1 +1)+log(variable 2 +1)+…+season). The predictors with a VIF value higher than 3 were excluded from modelling (zuur et al. 2009). Secondly, the remaining predictors were rejected if they were highly intercorrelated (c ≥0.7; p < 0.001) (TaBachnicK & Fidell 1996;MosTaFaVi et al. 2019). Sampling site was added as a crossed random effect and sampling date (year) was added as nested random effect. Season (spring, summer or autumn) was added as a factor. Normality assumptions were assessed with the residual plots (Q-Q plots).
To define the goodness-of-fit, the marginal and conditional R² values for each fitted model were calculated as described by naKagawa & schielzeTh (2013). Only the metric response to pressures was decisive for the selection (R² conditional>35%). Four classes of river width were defined by graphical analyses with smoother of number of species in function of river width. The ascending part was divided in three equal parts and the threshold for the next part is defined when the smoother reaches its plateau. Boxplots visualised patterns of the variation with river width classes of the value of metrics fitted with R² conditional >35%. Redundancy of responsive metrics was analysed with a Pearson correlation. Among the correlated metrics (c ≥0.7; p ≤0.001), the one with the best fitted model was chosen. For the less correlated metrics that were fitted by the model (c <0.7 and ≥0.5; p ≤0.05), the one that least correlated with other metrics was selected. The fact that a certain metric contains valuable ecological information required by the WFD can overrule the decision when selection criteria mentioned above were not fulfilled.

Scoring selected metrics and EQR
Fish samples should contain a minimum number of individuals for a consistent estimation of the assemblage structure (logez & ponT 2011). We based the definition of this minimum on the average number of individuals caught in sites with HabStat 1. To correct for catch inefficiency 90% of this average value is used as a benchmark (Breine et al. 2005;Van lieFFeringhe et al. 2010). Depending on the selected metric, river width or season is taken into account for scoring. Based on the reference list, a number of species is defined for each selected metric counting species, representing the values expected in the GEP status. To correct for catch inefficiency we take arbitrarily 70% of the calculated GEP value as the minimum threshold value for metric score 4. Threshold value determination for the selected metrics in other classes (bad, poor or moderate) was based on a combination of its average value in a particular river width class or season and trisection of this value (Karr 1981). This average reflects the lower boundary score for the moderate status (score 3) as most of the sites of this type of rivers in Flanders are considered to be in the moderate status (Van Thuyne et al. 2020b). MEP thresholds (score 5) could be attributed by using the calculated GEP value as the lower score. When the GEP threshold value for a particular metric could not be calculated from the reference (e.g., percentage metrics), it was defined by trisection of the average value in a particular river width class or season. If a percentage metric increases in value with increasing pressure, a minimum value is defined by trisection percentage below which again a low score is given. If a percentage metric decreases in value with increasing pressure, a maximum value is defined by trisection percentage except for metrics assessing recruitment. The detailed scoring procedure of the selected metrics is given in appendix (scoring selected metrics).
For each sampling site total IBI was calculated as the sum of the scores of the selected metrics. To be compatible with the WFD, the IBI values are then transformed to EQR values. The EQR is a numerical value ranging between 0 (bad status) and 1 (MEP) and is calculated with the following formulae: EQR= (IBI-lowest IBI possible)/(maximum IBI possible-lowest IBI possible) (hering et al. 2006). Integrity classes (e.g., bad, poor, moderate, GEP and MEP) were defined as described in Breine et al. (2015).

Validation
Allowing a class difference of one unit (small Type I or II error see Breine et al. 2007Breine et al. , 2010, the index was validated by comparing the integrity class obtained per site with its HabStat status. We assessed data of sites used for the index development and an independent set of data. EQR scores were graphically screened with HabStat scores using both data sets. In addition a Spearman's rank test (2-tailed) assessed the correlations among the selected metric-scores and the EQR.

Fish assemblage
A DCA with the relative numbers of the 20 most abundant species shows a big overlap between the barbel and bream zone assemblages (Fig. 3). Because there is no clear difference, we group all data together for further analyses and only one common index for both river types will be developed.
Between1993 and 2015 45 fish species were caught 10 of which are non-indigenous. In total 216 062 individuals were caught or 0.3 individuals per surveyed m². In both bream and barbel zones threespined stickleback (Gasterosteus aculeatus Linnaeus, 1758) is the most common species followed by gudgeon (Gobio gobio Linnaeus, 1758), roach (Rutilus rutilus Linnaeus, 1758) and ninespine stickleback (Pungitius pungitius Linnaeus, 1758). Only 14 native species have a catch frequency > 10% ( Table 2). The maximum number of native species encountered in a single survey is 17 which occurred in 2 sites only (0.13%). Barely 9.7% of the surveyed sites had 10 or more species and in more than half of the stations (54.2%) 4 or less species were caught.

Water quality and HabStat
Water quality parameters in the bream and barbel type of sites show a higher within than between variation (Table 3). In 21.9% surveys of the bream sites and 10.5% surveys of the barbel sites dissolved oxygen was lower than 6 mg/l at the time of the survey. This is the minimum oxygen concentration required for sustainable fish life in rivers as stipulated by Belgian law (VlareM II 2010). In the sites belonging to the bream zone 18.3% have a HabStat of 1, 41.3% score 2, 38.5% score 3 and only 1.9% get a 4. In the barbel zone 15.3% score 1, 34.9% scores 2, 24.5% score 3 and 25.3% score 4. HabStat 1-4 (2.2 ± 0.7) 1-4 (2.1 ± 0.7)

TABLE 3
The range (min-max) of the abiotic data collected during each survey including water quality parameters, river width and HabStat (including mean ± standard deviation) in rivers of bream and barbel zone (n=number of sites).

Reference fish species and attribution to ecological guilds
The updated reference fish species list for rivers in Flanders includes 37 species (Table 2). Nonindigenous species and the estuarine common goby (Pomatoschistus microps, Krøyer, 1838) are not considered as MEP or GEP species for rivers. Eighteen species from the list were rarely caught (<5%) and are considered as MEP species or GEP can still be reached when they are absent. The remaining 19 species contribute to the GEP status. The metric attribution is also presented in Table 2. The metrics belong to habitat use guild (diadromic and benthic species), trophic guilds (piscivores and invertivores); reproductive guilds, e.g., specialised spawners (MnsSpa) including gravel spawners, phytophilic and ostracophilic spawners; habitat use guilds, e.g., benthic and phytophilic species and tolerance guilds, e.g., tolerant and intolerant species and pioneers species which are defined as opportunistic resistant species that are first to be found once a habitat is restored (MnsPio). Pioneers include three non-native species: Prussian carp (Carassius gibelio Bloch 1782), stone moroko (Pseudorasbora parva Temminck & Schlegel, 1846) and pumpkinseed (Lepomis gibbosus Linnaeus 1758).

Selection of metrics
The metric screening from literature resulted in a total of 44 candidate metrics (Table 4).
The measure of association analyses allowed the selection of uncorrelated variables to be used in the model. The correlations among the abiotic variables are small (Table A1, Appendix). From the variance inflating factor analyses we only drop dissolved oxygen (VIF»3 in Table A2 in Appendix). Water temperature is kept as its VIF is just slightly above 3 and is known as an important variable for fish (e.g., weBB & walsh 2004; walBerg 2011).
Except for MniInd/m² (number of individuals per m²), data variation in metrics based on number of individuals did not show a clear pattern in the different HabStat classes and were therefore omitted. This reduced the number of candidate metrics to be tested with linear mixed models to 30 (Table A3, Appendix).
Normality analyses excluded following fitted metrics: invertivorous species (MnsInv), relative percentage piscivorous individuals (MpiPis), number of diadromic and rheophilic species (MnsDia, MnsRhe), biomass non-indigenous fish (BioExo), number of benthic species (MnsBen), total biomass (ManBio), number of phytophilic species (MnsPhy), relative percentage pioneers (MpiPio) and number of tolerant species (MnsTol). The metrics assessing the number of non-indigenous and pioneer species (MnsExo, MnsPio) are removed as they show an opposite reaction to habitat quality improvement as expected. The correlation between the fitted metrics is given in Table A4 (Appendix).
We categorised river width as follows: class 1: ≤ 5m (n=1057); class 2: >5 and ≤10m (n=350); class 3: >10 and ≤ 15m (n=108) and class 4 >15 and ≤30m (n=82 The selected metrics show a gradient with HabStat classes (Fig. 4). Other fitted metrics also showed a gradient but were omitted as they were highly correlated with other metrics (Table A4) or had a small range of variation (number of invertivorous species MnsInv 0-4). The less fitted MpsRec is kept as it is indirectly a metric required by the WFD. It shows a relation with HabStat but is independent from river width (Figs 4-5) and is not correlated with any selected metric. The model shows a seasonal effect on recruitment (Table A3). In a next step the 5 selected metrics needed scoring.  Candidate metrics with their predicted response to increasing disturbance.

Metric scoring
The average number of individuals caught in sites with HabStat 1 was 34. Hence the required number of individuals in a sample for a metric calculation was minimum 30. The rationale of metric scoring is explained for metric MnsTot. The scoring procedure for the other metrics is similar and explained in appendix 'Scoring selected metric'. Values of the metric percentage of recruiting species (MpsRec) are season dependent (Fig. 6) and scoring criteria are therefore related to season.

Total number of species (MnsTot)
Seventeen was the maximum number of reference species caught at one site in the studied period, while 19 is the number of species in GEP ( Table 2). The lower limit of score 4 (GEP) is determined by 70% of 19 species (i.e., 13). Scoring depends on the river width as this metric is affected by it (Fig. 5).
The average metric value of MnsTot in each river width class (see table A5 in Appendix) is the lower boundary for the moderate metric score which is 3. The lower score of GEP determines automatically the upper boundary for the moderate score. Trisection gives the lower boundaries for the metric scores 2 and 1. For example in the river width class ≤5m the average is 4 species. Therefore, 4 is the lower boundary for score 3, i.e., the moderate metric score. The upper boundary for this score is 12 as 13 is the lower boundary for score 4 (GEP). To obtain the lower boundary for the poor class (score 2) we apply the trisection as follows: 4/3*2= 3; for the bad class (score 1): 4/3= 1. We apply the same approach for the other river width classes to obtain the upper boundaries of the metric scores 1 and 2. Values lower than the lower value of score 1 are scored as 0. Boundary values for score classes are given in Table 5.

IBI scoring thresholds and EQR
The maximum IBI score with selected metrics is 25 and the minimum is 1. This minimum value is possible when MpiTol scores 1 and the other metrics score zero. The index score was transformed to an EQR calculated as a value between 0 and 1. The appreciation of the status is defined by the EQR value (Table 6).

Validation
The internal validation with development data set was performed using data of 963 surveys in the period 1993-2015 as in 582 surveys less than 30 fish individuals were caught.    (Belpaire et al. 2000) and the new EQR values showed a low but significant correlation (c=0.562; p<0.001). In the external validation 6 sites out of 131 are scored too high (Type II error) with the old index while with the new index one site was scored too low (Type I error). Setting out the variation of EQR scores in function of HabStat scores shows a very good gradient with the internal data but with the external data the good HabStat (score 4) is not distinguished from the moderate sites (Fig. 7). All metric scores, except MpsRec with the internal validation data, were significantly positively correlated with the EQR (Table 7).

Discussion
In our study all necessary steps to design a new fish-based index of biotic integrity (Fig. 2) were successfully taken.
Fish data were obtained with a single standardised fishing methodology according to CEN standards (CEN 2003), reducing the effect of the fish community's gear specific reactions to disturbances.
Pre-classification of habitat status was used as a technique to rank sampling sites in a reasonable way with respect to anthropogenic pressures, enabling to establish the fish-based metrics' reaction to multiple stressors (see also in Breine et al. 2007Breine et al. , 2010quaTaerT et al. 2011;wu et al. 2014;Breine et al. 2015;ergönül et al. 2018;schinegger et al. 2018;conino et al. 2020). Different variables were chosen for their known effects on the fish community at different spatio-temporal scales (e.g.,   Spearman's rho correlation between the metrics and EQR, ** correlation is significant at the 0.001 level (2-tailed) (For abbreviations, see Table 4). Pre-classification threshold values were attributed with expert judgement. The rationale was to obtain a good ranking of habitats with respect to human impact and not necessarily an absolute expression of the habitat quality. The ranking was needed for the first metric selection and for validation. Sometimes HabStat classes 3 and 4 are not clearly distinguished by fitted metrics, probably because there is a disproportion of HabStat class 4 (17) and HabStat 3 (566) sites in our data set.

MnsTot
A reference list of fish species for pristine rivers could not be established because all lowland rivers in Flanders are heavily modified. Instead, we developed a list representing the GEP status and one for the MEP status in accordance with the WFD guidelines. These differ slightly from the list presented by Belpaire et al. (2000) which was based on hueT (1949) and VandelannooTe et al. (1998). We added flounder (Platichthys flesus Linnaeus, 1758), river lamprey (Lampetra fluviatilis Linnaeus, 1758), Atlantic salmon (Salmo salar Linnaeus, 1758) and Wels catfish (Silurus glanis Linnaeus, 1758). These species were frequently caught except for salmon. But the latter species has the potential to re-enter the rivers as it is found in the Belgian part of the River Maas. All non-indigenous species were omitted from the list because they are considered as indicators of disturbance (e.g., Karr 1981;Belpaire et al. 2000). dézerald et al. (2020) could detect impairment of river fish communities induced by nonindigenous species in 1527 sites belonging to the French Water Framework Directive's surveillance monitoring. However, their effects on the native communities could not be defined and schlaepFer (2018) argued that biodiversity indices should include non-indigenous species. Thus, we excluded nonindigenous species from all metrics except Shannon-Wiener Index (ManSha) and metrics assessing alien species. Species residing for decades in our waters, e.g., pike-perch, are considered as naturalised and are included in the reference list. We consider the reference list as the pool containing possible fish species in the rivers of bream and barbel type in Flanders.
For the metric selection we applied a repeatedly used methodology (oBerdorFF et al. 2002;ponT et al. 2007;Freund & peTTy 2007;Breine et al. 2004Breine et al. , 2015MosTaFaVi et al. 2015;ergönül et al. 2019), but with minor modifications. Dissolved oxygen, albeit an important water quality parameter (e.g., aziMi & rocher 2016), was omitted as predictor from the model using the Akaike information criterion (AIC). Linear mixed modelling is considered as a reliable tool for ecological studies (einheuser et al. 2013) and allowed for a first metric selection based on the sensitivity for one or more predictors.
The new index includes three metrics associated with guilds (tolerance, recruitment and a trophic guild) and two associated with diversity (the total number of native species (MnsTot) and Shannon-Wiener index (ManSha)) all of which have already been used in different other fish-based indices. The number of native species decreases with increased human impact and is a widely used metric in fish-based indices (e.g., Breine et al. 2004;MosTaFaVi et al. 2019). Non-indigenous species could indeed mask the response to habitat degradation as they are mostly tolerant (casaTTi et al. 2009;casaTTi & Teresa 2012).
ManSha includes all species but integrates their evenness and is often used in fish-based indices (wu et al. 2014; yerli et al. 2016; ergönül et al. 2019). The value of this index usually ranges between 1.5 and 3.5 and rarely exceeds 4.5 (Magurran 2004) and its highest value in our data set was 2.38.
The relative percentage of tolerant individuals (MpiTol) is the only selected metric increasing in value with decreasing habitat quality. It is logical to define a minimum percentage below which again a low score is obtained as a total absence of tolerant species is not natural. Their observed trend in our study is in accordance with observations made in other studies (oBerdorFF et al. 2002;Vehanen et al. 2010;casaTTi & Teresa 2012;li et al. 2018).
The relative percentage of invertivorous individuals (MpiInv) is a metric related to a trophic guild. Aquatic habitat degradation may have an adverse effect on fish with specific feeding requirements such as invertivorous fish (noBle et al. 2007). A decrease in this metric also indicates the alterations in the aquatic food chain (ergönül et al. 2019).
Percentage recruitment (MpsRec) contains information about the fish condition and habitat use. Fish should be able to recruit naturally. The metric 'percentage recruitment' was retained because we consider that even a less optimal metric sometimes can provide valuable information in combination with other metrics (see also Breine et al. 2007). In addition, this metric is based on length classes present per species and therefore complies with one of the WFD requirements. The scoring of this metric, which is also used in the initial index, is now taking into account seasonal effects.
Metric scoring is often based on comparison with reference sites (e.g., Karr 1981;oBerdorFF et al. 2002;casaTTi et al. 2009;Vehanen et al. 201;adaMczyK et al. 2017). However, the use of reference conditions as a benchmark for metric scoring has a number of limitations. It could be an unrealistic target for management (Tweedley et al. 2017). Moreover, global warming can change the environmental conditions in an unpredictable way and can modify the structure of pristine aquatic communities (laTli et al. 2017). During the European fish-based intercalibration exercise it became clear that many countries lack pristine sites and had to use the best available sites as reference (Jepsen & ponT 2007(Jepsen & ponT ). pardo et al. (2012 developed a protocol to identify near pristine sites in the Central and Baltic region of Europe. In Flanders no pristine sites or even natural rivers are present. In the current study a data driven methodology was used to score the selected metrics. Different authors applied similar techniques to create a reference based on data (BozzeTTi & schulz 2004;paganelli et al. 2011;BorJa et al. 2012;sapounidis et al. 2019). Values for the selected metrics changed with river width or season. Therefore, scoring was adjusted to these variables but always the same rationale was used to define scoring criteria. Using the average metric value within a particular river width class or season assures a realistic and robust approach. Attempts to estimate the slope of metric values, under which trisection is performed, by applying trendline regression (e.g., Belpaire et al. 2000) suffer from the potential inadequacy of overestimating the expected condition for metrics that vary along natural environmental gradients (Kennard et al. 2006).
For validation we allowed a one-class difference between the habitat status (HabStat pre-classification) and the EQR to define tendencies, of under-or over-estimation, as was done by other authors (goFFaux et al. 2001;Breine et al. 2015;ergönül et al. 2019). Internal IBI validation with boxplots as an initial pertinence test for the developed index revealed a good correlation for EQR scores HabStat. With the validation dataset good HabStat was not distinguished from moderate but only two good HabStat sites (score 4) were represented. One metric (MpsRec) was not correlated (r < 0.5, p > 0.01, Table 7) with the EQR for internal data. However, with the external data the correlation was small but still significant. Nevertheless, this metric was retained because it is required by the WFD as a proxy of age structure. Overall, the newly developed index was able to distinguish different degrees of degradation within the pre-classified sites.

Conclusion
The result of our study is a validated method to assess the ecological status of Flandrian lowland rivers, using a single standardised fishing methodology according to CEN standards (CEN, 2003) and with limited reliance on expert judgement compared to Belpaire et al. (2000). The developed IBI complies adequately to the criteria stipulated in the Water Framework Directive. We ensured that the selected metrics are not redundant and are relevant allowing for appropriate assessment of anthropogenic impacts on the fish communities. Moreover, we demonstrated that fish communities in barbel and in bream type of rivers can be assessed using the same IBI.

TABLE A4
Pearson coefficient (c) and significance (**p ≤0.001; * p≤0.05) for correlation analysis fitted metrics log (L) or square root (S) transformed (abbreviations, see Table 4).  Average and maximum values of selected metrics in function of river width classes (abbreviations, see Table 4).