Ava Mehdipour, Selina Malouka, Marla Beauchamp, Julie Richardson, Ayse Kuspinar, Measurement properties of the usual and fast gait speed tests in community-dwelling older adults: a COSMIN-based systematic review, Age and Ageing, Volume 53, Issue 3, March 2024, afae055, https://doi.org/10.1093/ageing/afae055
Abstract
The gait speed test is one of the most widely used mobility assessments for older adults. We conducted a systematic review to evaluate and compare the measurement properties of the usual and fast gait speed tests in community-dwelling older adults.
Three databases were searched: MEDLINE, EMBASE and CINAHL. Peer-reviewed articles evaluating the gait speed test’s measurement properties or interpretability in community-dwelling older adults were included. The Consensus-based Standards for the selection of health Measurement Instruments guidelines were followed for data synthesis and quality assessment.
Ninety-five articles met our inclusion criteria, with 79 evaluating a measurement property and 16 reporting on interpretability. There was sufficient reliability for both tests, with intraclass correlation coefficients (ICC) generally ranging from 0.72 to 0.98, but overall quality of evidence was low. For convergent/discriminant validity, an overall sufficient rating with moderate quality of evidence was found for both tests. Concurrent validity of the usual gait speed test was sufficient (ICCs = 0.79–0.93 with longer distances) with moderate quality of evidence; however, there were insufficient results for the fast gait speed test (e.g. low agreement with longer distances) supported by high-quality studies. Responsiveness was only evaluated in three articles, with low quality of evidence.
Findings from this review demonstrated evidence in support of the reliability and validity of the usual and fast gait speed tests in community-dwelling older adults. However, future validation studies should employ rigorous methodology and evaluate the tests’ responsiveness.
Key Points
Low quality of evidence demonstrated sufficient reliability for both usual and fast gait speed.
Moderate quality of evidence demonstrated sufficient convergent/discriminant and concurrent validity for usual gait speed.
Very low to low quality of evidence demonstrated insufficient known-groups and sufficient predictive validity for both speeds.
Low quality of evidence demonstrated sufficient responsiveness for usual pace and insufficient responsiveness for fast pace.
Background
Gait speed is a widely used mobility assessment for community-dwelling older adults, often referred to as ‘the sixth vital sign’ [1]. It has been shown to predict health outcomes, such as mortality, disability and hospitalisation, in older adults [2, 3]. Gait speed is a quick and easy-to-administer test as it does not require any special equipment, whilst being clinically useful [4, 5]. During the test, an assessor times the participant using a stopwatch as they walk a predetermined distance; their speed, in metres per second, is then recorded.
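The calculation behind the test is a single division of distance by time. The sketch below illustrates it; the function name and the 4 m, 5.0 s trial are hypothetical values chosen for illustration and are not taken from any included study.

```python
def gait_speed_m_per_s(distance_m: float, time_s: float) -> float:
    """Return walking speed in metres per second for one timed trial."""
    if distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return distance_m / time_s


# Hypothetical trial: a 4 m course walked in 5.0 s on a stopwatch.
speed = gait_speed_m_per_s(4.0, 5.0)
print(f"{speed:.2f} m/s")  # 0.80 m/s
```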
The gait speed test can be administered at different speeds, most commonly usual and fast pace [4]. Although the two speeds have been reported to correlate well (r = 0.96) [6] and to be comparable in predicting survival in older adults [7], there is evidence to suggest that they may have different psychometric properties and clinical utility [8, 9].
The aim of this review was to critically appraise and compare the measurement properties of the usual and fast gait speed tests in community-dwelling older adults. Measurement properties appraised were reliability, to determine whether the tests provide consistent scores given stable conditions; validity, to assess whether gait speed accurately captures the mobility of community-dwelling older adults; and responsiveness, to assess whether gait speed accurately reflects changes in older adults’ mobility. The secondary aim of the review was to summarise interpretability values, such as minimal important change, for the gait speed tests in community-dwelling older adults.
Methods
A systematic review was performed following the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) guidelines [10] for evaluating and reporting measurement properties (PROSPERO ID CRD42021232169). This manuscript was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist [11].
Eligibility criteria
The inclusion criteria for studies were as follows: (1) majority of the study sample (i.e. >50% or mean age >65) was representative of the population of interest: community-dwelling older adults, ≥65 years; (2) the outcome measure under study was the gait speed test assessed manually (e.g. via stopwatch); (3) the aim of the study was the evaluation of one or more measurement properties or the interpretability of the gait speed test. Articles were excluded if they were (1) not peer reviewed, (2) not in English, (3) grey literature (e.g. meeting/conference proceedings/abstracts), (4) limited to disease-specific groups, (5) reviews or editorials, or (6) measured gait speed using electronic devices (e.g. GAITRite mat).
Information sources
The following electronic databases were searched: MEDLINE (1946 to 30 December 2022) and EMBASE (1974 to 30 December 2022) through Ovid, and CINAHL (1981 to 30 December 2022) through EBSCO.
Search strategy
Appendix 1 outlines the complete search strategy for each database. Search terms covered (1) the population: community-dwelling older adults, (2) the gait speed test and (3) measurement properties and interpretability (using the search filter developed by Terwee et al. [12]).
Study selection process
Titles and abstracts were screened independently by two reviewers (AM and SM) using Covidence. Full-text articles were screened independently by the same reviewers. Where consensus could not be reached, disagreements were resolved by a third reviewer (AK).
Data collection process and data items
Characteristics of the gait speed test in each study were extracted, such as the distance walked, speed and starting protocol. Study characteristics, such as country, setting, sample size, mean age and percent female, were also extracted. Information on the following measurement properties was extracted: reliability, measurement error, construct validity (convergent, discriminant and known-groups), criterion validity (concurrent and predictive) and responsiveness [13, 14]. Interpretability [13] statistics (e.g. cut-off scores and minimal important change) were also extracted. Data extraction was performed independently by two reviewers (AM and SM) to ensure all relevant information was captured. Disagreements were resolved through discussion or consultation with a third reviewer (AK).
Study risk-of-bias assessment
The COSMIN risk of bias checklist, which includes separate criteria for each measurement property, was used to assess the methodological quality of each study [15]. Studies were rated as very good, adequate, doubtful or inadequate. This quality assessment was performed by two reviewers (AM and SM) independently and discrepancies were discussed.
Measurement properties
COSMIN’s criteria for good measurement properties were used to rate the result of each measurement property in each study as sufficient, insufficient or indeterminate against a priori hypotheses [10]. Appendix 2 outlines the hypotheses formulated by the review team. Generally, reliability correlation coefficients, correlations with a gold standard and areas under the curve (AUCs) were expected to be ≥0.70 [10]. For validity, correlations were expected to be ≥0.50 with a walking-based physical measure, ≥0.30 with a measure assessing a related but dissimilar construct (e.g. function or quality of life) and <0.30 with a measure assessing an unrelated construct [10]. Each result was rated independently by two reviewers (AM and SM).
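As an illustration of how the general thresholds above translate into a rating for a single result, a minimal sketch follows. The category labels and the function name are hypothetical, the use of the absolute value of a correlation is an assumption, and the indeterminate rating and the study-specific hypotheses in Appendix 2 are not modelled.

```python
# Minimum expected values quoted in the text; category names are hypothetical labels.
MINIMUM_EXPECTED = {
    "reliability": 0.70,
    "gold_standard": 0.70,
    "auc": 0.70,
    "walking_measure": 0.50,
    "related_construct": 0.30,
}
UNRELATED_CEILING = 0.30


def rate_result(category: str, value: float) -> str:
    """Rate one reported coefficient as 'sufficient' or 'insufficient'."""
    magnitude = abs(value)  # assumption: the direction of a correlation is ignored
    if category == "unrelated_construct":
        return "sufficient" if magnitude < UNRELATED_CEILING else "insufficient"
    if category not in MINIMUM_EXPECTED:
        raise ValueError(f"unknown category: {category}")
    return "sufficient" if magnitude >= MINIMUM_EXPECTED[category] else "insufficient"


print(rate_result("reliability", 0.64))          # insufficient
print(rate_result("walking_measure", 0.93))      # sufficient
print(rate_result("unrelated_construct", 0.45))  # insufficient (hypothetical value)
```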
Synthesised results for each measurement property per type of gait speed test were rated against COSMIN’s criteria for good measurement properties [10]. If ≥75% of results were sufficient (or insufficient), then the overall rating was consistent and considered sufficient (or insufficient) [10]. If <75% of results were sufficient (or insufficient), then the overall rating was considered inconsistent and was based on the majority of the ratings [10].
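The 75% rule can be sketched as a small function over the counts of sufficient and insufficient results. The function name is hypothetical, and the handling of an exact tie is an assumption, because the rule as stated does not say how a 50/50 split is labelled.

```python
def overall_rating(n_sufficient: int, n_insufficient: int) -> str:
    """Apply the 75% consistency rule to counts of rated results."""
    total = n_sufficient + n_insufficient
    if total == 0:
        raise ValueError("at least one rated result is required")
    if n_sufficient / total >= 0.75:
        return "sufficient"
    if n_insufficient / total >= 0.75:
        return "insufficient"
    # Below the 75% threshold the rating is inconsistent and follows the majority.
    # Assumption: an exact tie has no majority and is reported as a tie.
    if n_sufficient == n_insufficient:
        return "tie; inconsistent"
    majority = "sufficient" if n_sufficient > n_insufficient else "insufficient"
    return f"{majority}; inconsistent"


# Counts taken from Table 2 (usual intra-rater reliability, usual and fast measurement error).
print(overall_rating(18, 1))  # sufficient
print(overall_rating(5, 3))   # sufficient; inconsistent
print(overall_rating(2, 6))   # insufficient
```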
Synthesis methods
Quantitative pooling for each speed (per measurement property) was not performed because studies employed different study designs and/or statistical methodologies [10, 14, 16]. Results were therefore summarised qualitatively, reporting mean ranges (if applicable) and the percentage of confirmed hypotheses. Results were synthesised separately for each test speed: usual and fast walking speed.
Certainty assessment
COSMIN’s modified Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach was used to evaluate the quality of evidence for each synthesised result [10, 17]. Grading was performed independently by two reviewers (AM and SM), considering the following three factors: risk of bias, inconsistency and imprecision (Table 1).
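The downgrading logic can be sketched as follows. The step sizes (one level for serious, two for very serious, three for extremely serious, starting from high) are an assumption based on the usual description of the modified GRADE approach; they are not stated explicitly in this section, although they agree with the labels reported in Table 2. The function and constant names are hypothetical.

```python
QUALITY_LEVELS = ["high", "moderate", "low", "very low"]
# Assumed number of levels to downgrade for each severity label.
DOWNGRADE_STEPS = {"none": 0, "serious": 1, "very serious": 2, "extremely serious": 3}


def grade_quality(risk_of_bias: str = "none",
                  inconsistency: str = "none",
                  imprecision: str = "none") -> str:
    """Start at 'high' and downgrade once per step, never below 'very low'."""
    steps = (DOWNGRADE_STEPS[risk_of_bias]
             + DOWNGRADE_STEPS[inconsistency]
             + DOWNGRADE_STEPS[imprecision])
    return QUALITY_LEVELS[min(steps, len(QUALITY_LEVELS) - 1)]


print(grade_quality(risk_of_bias="very serious"))                           # low
print(grade_quality(risk_of_bias="very serious", inconsistency="serious"))  # very low
print(grade_quality(inconsistency="serious"))                               # moderate
```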
Table 1. Quality of evidence assessment using COSMIN’s modified Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach [10]
Factors . | Considerations . |
---|---|
Risk of bias | Using COSMIN’s guidelines [10] of serious, very serious and extremely serious |
Inconsistency (only for inconsistent ratings) |
|
Imprecision |
|
Indirectness | This was not considered when performing GRADE as the inclusion criteria for the population was specific (i.e. only studies on community-dwelling older adults, 65 and older were included) and specific hypotheses were created for each comparison in construct validity and responsiveness |
Results
Study selection
The first search was performed on 25 February 2021; 5,056 records were screened, 369 full-text reports were assessed for eligibility and 78 were included. An updated search was performed on 30 December 2023; 1,110 records were screened, 104 full-text reports were assessed for eligibility and 17 were included. A total of 95 full-text articles were included in the review. Figure 1 outlines the study selection process and the reasons for exclusion.
Figure 1. PRISMA flow diagram for study selection.
Study characteristics
Appendix 3 provides an overview of study characteristics for all 95 included articles. The walking distance used for the gait speed test ranged from 2.4 m [18] to 20 m [19–22], with 4 m being the most common (reported in ~34% of articles). Mean gait speed for usual pace ranged from 0.55 m/s [23] to 1.59 m/s [24]. Mean gait speed for fast pace ranged from 0.70 m/s [25] to 2.24 m/s [26]. Sample sizes ranged from 15 [27] to 18,632 participants [19]. Mean age of participants ranged from 65.8 [28] to 89 years [29]. Mean number of morbidities ranged from 1.1 [30] to 6.4 [31] conditions.
Interpretability
Forty-four articles reported on the interpretability of the gait speed test (Appendix 3). Minimal important change values ranged from 0.03 [32, 33] to 0.12 m/s [33] for usual gait speed and 0.12 to 0.20 m/s [34] for fast gait speed. Minimal detectable change ranged from 0.01 [35] to 0.3 m/s [27] for usual gait speed and 0.15 [36] to 0.61 m/s [27] for fast gait speed. Cut-offs for health outcomes for the usual gait speed ranged from 0.47 m/s for activities of daily living (ADL) difficulties [37] to 1.41 m/s for pre-frailty [26]. For the fast gait speed, cut-offs ranged from 1.13 m/s for falls [38] to 2.33 m/s for pre-frailty [26].
Measurement properties
Seventy-nine articles evaluated the measurement properties of the gait speed test, with 52 articles reporting results on the usual speed, 6 reporting on the fast speed and 16 reporting on both. Five articles did not specify a usual or fast speed for the test and, thus, did not undergo data synthesis. The gait speed protocol, result and risk of bias for each study can be found in Appendixes 4 and 5. Table 2 outlines the summary of results, overall ratings and quality of evidence for each measurement property evaluated.
Measurement property | Usual gait speed: summary result | Usual gait speed: overall rating | Usual gait speed: quality of evidence | Fast gait speed: summary result | Fast gait speed: overall rating | Fast gait speed: quality of evidence |
---|---|---|---|---|---|---|
Intra-rater reliability | 95% of hypotheses met (+18, −1) ICC = 0.72–0.98 and r = 0.82 (for +18) | Sufficient | Low (very serious ROB) | 100% of hypotheses met (+11) ICC = 0.77–0.98 | Sufficient | Low (very serious ROB) |
Inter-rater reliability | 100% of hypotheses met (+2) ICC = 0.79–0.95 | Sufficient | Low (very serious imprecision) | 100% of hypotheses met (+1) ICC = 0.98 | Sufficient | Low (very serious imprecision) |
Measurement error | 63% of hypotheses met (+5, −3) SEM = 0.005–0.1 (+5, −1) | Sufficient; inconsistent | Very low (very serious ROB + serious inconsistency) | 25% of hypotheses met (+2, −6) SEM = 0.053–0.067 (+2) | Insufficient | Low (very serious ROB) |
Convergent/discriminant validity | 54% of hypotheses met (+49, −42) Correlation with walking measures: 0.13 (with daily life gait speed) to 0.93 (with 400 m) With measures of health/function: 0.09 (grip strength) to 0.72 (LLFDI-DL) With measures of depression and cognition: 0.02 (verbal fluency) to 0.74 (GDS) | Sufficient; inconsistent | Moderate (serious inconsistency) | 57% of hypotheses met (+20, −15) Correlation with walking measures: 0.12 (with daily life gait speed) to 0.60 (with SPPB) With measures of health/function: 0.07 (general health) to 0.69 (5STS) With measures of depression and cognition: 0.34 (GDS) to 0.70 (MoCA) | Sufficient; inconsistent | Moderate (serious inconsistency) |
Known-groups validity | 39% of hypotheses met (+14, −22) AUC for ADL difficulties = 0.68–0.91 (+4, −1) AUC for IADL difficulties = 0.66–0.83 (+1, −6) AUC for mobility limitations = 0.75–0.80 (+3) AUC for frailty = 0.85 (+1) AUC for fallers = 0.69 (−1) AUC for pre-frailty = 0.64 (−1) AUC for bone strength = 0.56–0.58 (−2) | Insufficient; inconsistent | Low (very serious inconsistency) | 40% of hypotheses met (+2, −3) AUC for fallers = 0.71 (+1) AUC for pre-frailty = 0.61 (−1) AUC for bone strength = 0.59–0.61 (−2) | Insufficient; inconsistent | Low (very serious inconsistency) |
Concurrent validity | 50% of hypotheses met (+13, −13) With longer gait speed: ICC = 0.79–0.93 (+7) Manual and automatic: r = 0.33–0.73 (+1, −4) | Sufficient; inconsistent | Moderate (serious inconsistency) | 14% of hypotheses met (+1, −6) With longer gait speed: ICC = 0.87 (+1) | Insufficient | High |
Predictive validity | 51% of hypotheses met (+41, −39) Frailty: AUC = 0.75–0.87 (+2) Falls: AUC = 0.57–0.81 (−5, +5) Mortality: AUC = 0.69–0.75 (−1, +2) Hospitalisation: AUC = 0.62–0.72 (−4, +1) EQ-5D: AUC = 0.59–0.67 (−2) | Sufficient; inconsistent | Very low (very serious ROB + serious inconsistency) | 78% of hypotheses met (+7, −2) Mortality: AUC = 0.58 (−1) | Sufficient | Low (very serious ROB) |
Responsiveness | 82% of hypotheses met (+9, −2) | Sufficient | Low (very serious ROB) | 0% of hypotheses met (−4) | Insufficient | Low (very serious ROB) |
5STS, 5-repetition sit-to-stand; AUC, area under the curve; ADL, activities of daily living; EQ-5D, EuroQol 5 dimensions; GDS, Geriatric Depression Scale; ICC, intraclass correlation coefficient; IADL, instrumental activities of daily living; LLFDI-DL, Late-Life Function and Disability Instrument–Disability Limitation; MoCA, Montreal Cognitive Assessment; ROB, risk of bias; SEM, standard error of measurement; SPPB, Short Physical Performance Battery.
Table 2. Summary of results and ratings
Reliability
Twenty articles reported on the intra-rater reliability of the gait speed test and 17 underwent data synthesis, with 11 reporting on usual pace [30, 35, 39–47], 4 reporting on fast pace [25, 29, 36, 48] and 2 reporting on both [49, 50]. Intraclass correlation coefficients (ICCs) ranged from 0.72 to 0.98 for usual pace (except for 1 study [46] reporting an ICC of 0.64) and 0.77 to 0.98 for fast pace. Three articles reported on the inter-rater reliability of the gait speed test and 2 underwent data synthesis, where 1 reported on the usual pace [45] and 1 reported on both speeds [49]. ICCs ranged from 0.79 to 0.95 for usual pace and an ICC of 0.98 was reported for fast pace. For both intra- and inter-rater reliability, the overall rating for both speeds was sufficient.
Measurement error
Eight articles reported on the measurement error of the gait speed test and 6 underwent data synthesis, with 4 reporting on the usual pace [35, 42, 45, 46] and 2 reporting on the fast pace [29, 36]. The standard error of measurement (SEM) ranged from 0.005 to 0.1 m/s for usual pace and 0.053 to 0.067 m/s for fast pace. Limits of agreement (LOA) >0.1 m/s were reported for both speeds [29, 36, 42]. Usual gait speed had an overall sufficient rating and fast gait speed had an overall insufficient rating for measurement error.
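For readers less familiar with these statistics, the sketch below shows one common way an SEM and the corresponding 95% minimal detectable change (MDC95) are derived from a reliability coefficient. The formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM are standard textbook forms; whether each included study used them is not stated here, and the input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM = SD * sqrt(1 - ICC), in the same units as the SD (m/s here)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change_95(sem: float) -> float:
    """MDC95 = 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical sample: between-person SD of 0.20 m/s and a test-retest ICC of 0.90.
sem = standard_error_of_measurement(0.20, 0.90)
mdc95 = minimal_detectable_change_95(sem)
print(f"SEM = {sem:.2f} m/s, MDC95 = {mdc95:.2f} m/s")  # SEM = 0.06 m/s, MDC95 = 0.18 m/s
```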
Convergent/discriminant validity
Twenty-five articles reported on the convergent/discriminant validity of the gait speed test and underwent synthesis, with 16 reporting on the usual pace [23, 47, 51–64], 1 reporting on the fast pace [51] and 6 reporting on both [26, 28, 50, 65–68]. Usual gait speed had a wide range of correlations with other walking-based measures, ranging from 0.13 with daily life gait speed [57] to 0.93 with the 400 m walk test [47]. In comparison, fast gait speed had correlations ranging from 0.12 with daily life gait speed [66] to 0.60 with the Short Physical Performance Battery (SPPB) [50]. There were also wide ranges for usual and fast gait speed with measures of health and function that did not include a walking component (Appendix 4). For example, usual gait speed had correlations ranging from 0.09 with grip strength [26] to 0.72 with the Late-Life Function and Disability Instrument [65]. The fast gait speed test had correlations ranging from 0.07 with general health [51] to 0.69 with 5-repetition sit-to-stand [65]. Correlations with muscle mass were higher and in accordance with our hypotheses (0.30–0.31) [50] for fast gait speed compared with usual gait speed (0.06–0.18) [50, 69]. Correlations with measures assessing mental health/function, such as cognition and depression, were greater than expected, ranging from 0.02 [59] to 0.74 [65] for usual gait speed and 0.34 to 0.70 [65] for fast gait speed. Both speeds had an overall sufficient rating for convergent/discriminant validity.
Known-groups validity
Sixteen articles reported on the known-groups validity of the gait speed test and 15 underwent data synthesis, where 11 reported on the usual pace [22, 30, 37, 54, 70–76] and 4 reported on both speeds [26, 38, 67, 68]. Studies reported AUCs >0.70 for usual gait speed’s ability to discriminate between those with/without ADL/instrumental activities of daily living (IADL) difficulties [37, 70, 73], mobility limitations [30, 54] and frailty [72], and for fast gait speed’s ability to discriminate between fallers and non-fallers [38]. However, studies also reported AUCs <0.70 for usual gait speed’s ability to discriminate between those with/without ADL/IADL difficulties [37, 71, 73], and for both speeds’ ability to discriminate between those with/without pre-frailty [26] and bone strength [68]. Both speeds had an overall insufficient rating for known-groups validity.
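To make the AUC criterion concrete, the sketch below computes an AUC as the proportion of case and non-case pairs in which the person with the condition walks more slowly, counting ties as half. This is the rank-based (Mann-Whitney) form of the AUC. The function name and the gait speed values are hypothetical, and slower walking is assumed to indicate the condition.

```python
def auc_slower_indicates_condition(cases: list[float], non_cases: list[float]) -> float:
    """Probability that a random case walks slower than a random non-case (ties count 0.5)."""
    if not cases or not non_cases:
        raise ValueError("both groups must contain at least one value")
    favourable = 0.0
    for case_speed in cases:
        for non_case_speed in non_cases:
            if case_speed < non_case_speed:
                favourable += 1.0
            elif case_speed == non_case_speed:
                favourable += 0.5
    return favourable / (len(cases) * len(non_cases))


# Hypothetical usual gait speeds (m/s) for people with and without a mobility limitation.
with_limitation = [0.6, 0.7, 0.9, 1.0]
without_limitation = [0.8, 1.0, 1.1, 1.2, 1.3]
auc = auc_slower_indicates_condition(with_limitation, without_limitation)
print(auc)  # 0.875, above the 0.70 threshold
```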
Concurrent validity
Eight articles reported on the concurrent validity of the gait speed test and underwent synthesis, with 6 reporting on the usual pace [35, 39, 40, 58, 62, 77], 1 reporting on the fast pace [36] and 1 reporting on both [27]. For usual gait speed, studies reported ICCs of 0.79 to 0.93 and LOA greater than 0.1 m/s with longer gait speed tests [35, 39, 62], and correlations with automatic tests (e.g. automatic timers, walkway mats, accelerometers) ranged from 0.33 to 0.73 [40, 58, 77]. For fast gait speed, one study [36] reported an ICC of 0.87 with a longer gait speed test, with LOA >0.1 m/s. Peyrusque et al. [27] compared remotely administered gait speed with in-person administration and reported an ICC of 0.77 for usual pace and 0.62 for fast pace. Usual gait speed had an overall sufficient rating and fast gait speed had an overall insufficient rating for concurrent validity.
Predictive validity
Twenty-eight articles reported on the predictive validity of the gait speed test and 25 underwent data synthesis, where 20 reported on the usual pace [3, 31, 41, 43, 64, 78–92], 1 reported on the fast pace [51] and 4 reported on both [8, 20, 21, 93]. For usual gait speed, studies reported AUCs above 0.70 for its ability to predict frailty, falls, mortality and hospitalisation [41, 80, 81, 84, 92]; however, some studies also reported AUCs <0.70 for its ability to predict falls, hospitalisation and quality of life [31, 43, 83, 84, 92]. For fast gait speed, studies reported significant odds/hazard ratios for its ability to predict cognitive decline, mobility disability and functional dependence [8, 20, 21]; however, one study [51] reported an AUC <0.70 for its ability to predict mortality. Both speeds had an overall sufficient rating for predictive validity.
Responsiveness
Three articles reported on the responsiveness of the gait speed test and underwent synthesis, with 2 reporting on the usual pace [63, 89] and 1 reporting on the fast pace [94]. For usual gait speed, Mansson et al. [63] reported a correlation of 0.37 with SPPB and correlations >0.10 with measures of function over 4 months, and Beauchamp et al. [89] reported an effect size of 0.23 for decline in health status and 0.15 for increase in health status at 2 years. For fast gait speed, Lan et al. [94] reported AUCs <0.70 for gait speed’s ability to detect walking difficulties at 3 years. Usual gait speed had an overall sufficient rating and fast gait speed had an overall insufficient rating for responsiveness.
Risk of bias
Risk of bias for intra-rater reliability, measurement error, predictive validity and responsiveness of both speeds was very serious due to multiple studies of inadequate quality or one study of doubtful quality. There was no risk of bias for inter-rater reliability, convergent/discriminant validity, known-groups validity and concurrent validity of both speeds due to multiple studies of at least adequate quality or one study of very good quality.
Certainty of evidence
Figure 2 displays the summary of ratings and quality of evidence for each measurement property, separated by usual and fast speed. Very low to low quality of evidence was found for the reliability, measurement error, known-groups validity, predictive validity and responsiveness of both speeds. Moderate quality of evidence was found for the convergent/discriminant validity of both speeds and for the concurrent validity of usual gait speed. High quality of evidence was found for the concurrent validity of fast gait speed.
![Summary of ratings and quality of evidence for each measurement property, separated by speed. + indicates a sufficient rating; − indicates an insufficient rating. The y-axis represents the number of hypotheses (e.g. 1 article could report multiple correlation values for convergent validity).](https://oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/ageing/53/3/10.1093_ageing_afae055/1/m_afae055f2.jpeg?Expires=1722364785&Signature=x6TFbiUDyTX8EcqSdE8HPE2RHV4v9FhdXWfSHW3nm30mVqjabw1s7vhVQ5QESMGoWTU9Wbjdr7uFn08Z~tpD9NGBNcIDaBSBZPr9ARcfIR5rBMlWa5RtCAnXlhtO2~AyZb7FhVnKhyn8PILLDgIlkZcU-sRfulxSbpoFIK7nVdU1z0WzjygkUQZ6DOCQ-qBF-Y5BK2Q1wlxa6rZzrsHBLe57nP12a5rw4C6YKuJsH6dj0BT17SIijvnOHMp4j-C7hoxdAsIO6LaKHRxcpAGONRivF5Z6wn0~EeCeQbWe62ZkpOU-r2uoGY41E~DoBPn3AOSIuCRh4WZeK435ODwntw__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA)
Discussion
We performed a systematic review to examine the measurement properties of both the usual and fast-paced gait speed tests in community-dwelling older adults. Reliability, convergent/discriminant validity and predictive validity of both gait speed tests were found to be sufficient, thus supporting the stability of test scores over time and between raters, the tests’ ability to capture community-dwelling older adults’ mobility, and their ability to predict health outcomes in this population when used in clinical practice and research.
It is important to note that the evidence for reliability and predictive validity was of lower quality due to design and methodological issues. For example, a common study design issue for intra-rater reliability studies was an inappropriate time lapse between assessments. Studies tended to administer assessments consecutively with little time lapse, which may have introduced fatigue between assessments. A common methodological issue for predictive validity studies was the use of regression analyses instead of AUCs; AUCs are more appropriate because they can inform diagnostic accuracy. Studies with appropriate study designs and statistical analyses are recommended to confirm the reliability and predictive validity of the usual and fast gait speed tests.
Although there was strong evidence (sufficient ratings and moderate quality) to support the convergent validity of the gait speed tests, low correlations with daily life gait speed were found for both speeds. It is important to note that even though the review team hypothesised correlations >0.50 as both tests are measured in the same unit (i.e. m/s) and involve the same task (walking), none of the hypotheses with daily life gait speed were met. From this evidence, we can conclude that gait speed may not be reflective of daily walking speed in community-dwelling older adults. Moreover, based on our results for convergent validity, we can conclude that fast gait speed is more closely related to older adults’ skeletal muscle mass than usual gait speed and that gait speed and constructs of cognitive function and mental health are related in community-dwelling older adults.
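The way an overall rating can remain sufficient even when some hypotheses are not met can be sketched as follows. The example assumes the COSMIN convention that a property is rated sufficient when at least 75% of results agree with the a priori hypotheses; the function name and the correlation values are hypothetical and are not taken from the included studies.

```python
def rate_hypotheses(results, required_share=0.75):
    """results: list of (observed_r, hypothesised_minimum_r) pairs.
    A hypothesis is met when the observed correlation exceeds the minimum.
    Returns the share of hypotheses met and the overall rating."""
    met = sum(1 for observed, minimum in results if observed > minimum)
    share = met / len(results)
    rating = "sufficient" if share >= required_share else "insufficient"
    return share, rating


# Hypothetical correlations with walking-based comparators, each
# hypothesised to exceed 0.50; the last mimics a low correlation
# with daily life gait speed.
results = [(0.62, 0.50), (0.55, 0.50), (0.71, 0.50), (0.13, 0.50)]

share, rating = rate_hypotheses(results)
```

Here three of four hypotheses are met, so the overall rating is sufficient even though the hypothesis for the daily-life comparator fails, which mirrors the pattern described in the paragraph above.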
Known-groups validity was found to be insufficient for both the usual and fast gait speed tests. However, some studies reported AUCs >0.70 for gait speed’s ability to discriminate between participants’ physical health status, such as self-reported mobility limitations (i.e. difficulty walking ¼ mile or climbing 10 steps) [30, 54]. Due to inconsistencies in ratings, more studies are needed to evaluate this property before conclusions can be made about gait speed’s ability to discriminate between different clinical characteristics.
Evidence to support concurrent validity and responsiveness was stronger for usual gait speed than for fast gait speed. For concurrent validity, there was an overall sufficient rating and moderate quality of evidence supporting the usual gait speed test’s ability to capture mobility as measured by a gold standard (i.e. a well-accepted protocol in the field or gait speed with a longer distance). However, available evidence indicated that fast gait speed does not accurately reflect mobility as measured by a gold standard. Even though there was sufficient responsiveness of the usual gait speed test, studies were of low quality for both speeds due to methodological issues, such as missing hypotheses on expected effect sizes. Future studies should consider outlining hypotheses around magnitudes for effect sizes a priori. Only 3 studies reported on the responsiveness of the gait speed test; thus, more measurement studies are needed to evaluate the responsiveness of the gait speed test over time and in relation to different interventions.
A common methodological issue concerning all properties was the absence of a detailed description of the gait speed test administered. Without information on the starting protocol or speed, it is difficult to identify which test procedure provides reliable and valid scores for community-dwelling older adults. This poses a challenge for clinicians/researchers in determining the appropriate design of the gait speed measure in their practice. Future measurement studies should incorporate a clear description of the protocol employed for the gait speed test.
This is the first systematic review to provide a synthesis of the measurement properties of the gait speed test in community-dwelling older adults following well-accepted, standardised guidelines. A systematic review by Rydwik et al. [5] evaluated the reliability, validity and responsiveness of gait speed in populations aged ≥60; however, the search was performed prior to the release of the COSMIN guidelines [10] and captured literature only up until January 2009. The secondary aim of our review was to provide interpretability values of the usual and fast gait speed test for community-dwelling older adults. Minimal important change values found for usual gait speed (0.03 to 0.12 m/s) were comparable to values reported in adults with different health conditions (0.10 to 0.20 m/s) [95].
Although included studies were diverse in sample characteristics, such as age, sex and country, and measure characteristics, such as length and starting protocols, there were more studies evaluating the usual pace than the fast pace. Thus, more evidence was available to inform conclusions about the measurement properties of the usual gait speed test. Another limitation of our review was the exclusion of gait speed tests that were measured using electronic devices, such as accelerometers, sensors or the GAITRite mat. Therefore, findings from this review are only applicable to gait speed tests that were timed manually, and it is recommended that reviews be performed to explore the measurement properties of automatic gait speed tests in community-dwelling older adults.
Conclusion
Sufficient results from good-quality studies supported convergent/discriminant validity for both the usual and fast gait speed tests, and concurrent validity for the usual gait speed test. Higher quality studies are needed to evaluate other measurement properties, such as the reliability and responsiveness, of both tests in community-dwelling older adults.
Declaration of Conflicts of Interest:
None.
Declaration of Sources of Funding:
None.