Impact of Exercise Type and Dose on Pain and Disability in Knee Osteoarthritis: A Systematic Review and Meta-Regression Analysis of Randomized Controlled Trials
Abstract
Objective
To identify the optimal exercise program, characterized by type and intensity of exercise, length of program, duration of individual supervised sessions, and number of sessions per week, for reducing pain and patient-reported disability in knee osteoarthritis (OA).
Methods
A systematic review and meta-analysis of randomized controlled trials were performed. Standardized mean differences (SMDs) were combined using a random-effects model. Study-level covariates were applied in meta-regression analyses in order to reduce between-study heterogeneity.
Results
Forty-eight trials were included. Similar effects in reducing pain were found for aerobic, resistance, and performance exercise (SMD 0.67, 0.62, and 0.48, respectively; P = 0.733). These single-type exercise programs were more efficacious than programs that included different exercise types (SMD 0.61 versus 0.16; P < 0.001). The effect of aerobic exercise on pain relief increased with an increased number of supervised sessions (slope 0.022 [95% confidence interval 0.002, 0.043]). More pain reduction occurred with quadriceps-specific exercise than with lower limb exercise (SMD 0.85 versus 0.39; P = 0.005) and when supervised exercise was performed at least 3 times a week (SMD 0.68 versus 0.41; P = 0.017). No impact of intensity, duration of individual sessions, or patient characteristics was found. Similar results were found for the effect on patient-reported disability.
Conclusion
Optimal exercise programs for knee OA should have one aim and focus on improving aerobic capacity, quadriceps muscle strength, or lower extremity performance. For best results, the program should be supervised and carried out 3 times a week. Such programs have a similar effect regardless of patient characteristics, including radiographic severity and baseline pain.
Systematic reviews of randomized controlled trials (RCTs) indicate that exercise therapy reduces pain and patient-reported disability in patients with knee osteoarthritis (OA), but to date, the optimal exercise regimen has not been identified (1, 2). The effects of exercise programs in clinical trials are likely to vary, since the interventions differ substantially in type of exercise (aerobic, strengthening, etc.), intensity of exercise, duration of intervention, and number of sessions per week. Furthermore, the patients included are often heterogeneous in age, sex, body mass index (BMI), radiographic severity of OA, and degree of malalignment (2).
Meta-regression analysis aims to relate the treatment effects recorded in different trials to the characteristics of patients and interventions in those trials in order to explain heterogeneity (3). Instead of performing a large number of studies to evaluate different variables relating to type and dose of exercise therapy, meta-regression analyses using study-level covariates in a multivariate regression analysis can be used to explore the effect of different exercise programs (3). The results of these meta-regression analyses of studies of exercise therapy can help provide evidence-based recommendations of type and dose of exercise for patients with OA of the knee.
The aim of the present study was to analyze the effect of published exercise therapy interventions in order to identify the optimal exercise program, characterized by type and intensity of exercise, length of exercise program, duration of individual supervised sessions, and number of sessions per week, for reducing pain and disability in knee OA.
PATIENTS AND METHODS
Protocol
Study selection, eligibility criteria, data extraction, and statistical analysis were performed according to the Cochrane Collaboration guidelines (4). Eligibility criteria and data analysis were specified in a protocol, which is available online at http://www.sdu.dk/en/Om_SDU/Institutter_centre/Iob_Idraet_og_biomekanik/Forskning/Forskningsenheder/FoF/Ph,-d-,d,-d-,-projekter/MEREX_carsten.
As stated in the protocol, included trials were RCTs comparing at least one exercise group to a non-exercise intervention control group. Patients had to have (explicitly stated) OA in either one or both knees, as defined by the American College of Rheumatology (ACR) criteria (5). Trials that included patients with knee and/or hip OA were included if separate data on the knee were available. The outcomes for evaluation of clinical efficacy were pain and disability, as recommended by Outcome Measures in Rheumatology III (6).
Literature search
A search of the literature for studies published up to May 2012 was performed in the following bibliographic databases with no restriction on publication year or language: Medline via PubMed, EMBase via OVID, CINAHL (including preCINAHL) via EBSCO, PEDro, and the Cochrane Central Register of Controlled Trials.
Search methods and study selection
The patient group was identified by performing the following search strategy in Medline: (‘Osteoarthritis’ [MeSH] OR ‘Osteoarthrit*’ [tiab] OR ‘Osteoarthros*’ [tiab]) AND (‘Knee’ [MeSH] OR ‘Knee Joint’ [MeSH] OR ‘Knee’ [tiab]). Exercise therapy interventions were identified by searching ‘Exercise’ [MeSH] OR ‘Exercis*’ [tiab] OR ‘Walking’ [MeSH] OR ‘Walk*’ [tiab] OR ‘Running’ [MeSH] OR ‘Run*’ [tiab] OR ‘Muscle Contraction’ [MeSH] OR ‘Strengthening’ [tiab] OR ‘Cycling’ [tiab] OR ‘Weight lifting’ [MeSH] OR ‘Weight lifting’ [tiab] OR ‘Jogging’ [MeSH] OR ‘Jogging’ [tiab] OR ‘Swimming’ [MeSH] OR ‘Swimming’ [tiab] OR ‘Pool therapy’ [tiab] OR ‘Aquatic exercise’ [tiab] OR ‘Hydrotherapy’ [MeSH] OR ‘Hydrotherapy’ [tiab] OR ‘Gymnastic’ [MeSH] OR ‘Gymnastic*’ [tiab]. All terms were searched as both keywords [MeSH] and text words in titles and abstracts [tiab]. To identify RCTs, the following filters were used: clinical trial, comparative study, controlled clinical trial, and randomized controlled trial. The above search strategy was adjusted for searches in EMBase, CINAHL, PEDro, and the Cochrane Central Register of Controlled Trials. In addition, reference lists from retrieved publications were screened. Two members of the study team (CJ and HL) independently scrutinized titles and abstracts. The full-text article was obtained if it was judged eligible by at least one reviewer, and eligibility of these full-text articles was judged independently by the same reviewers. Consensus on inclusion was reached by discussion.
Data collection
A customized data extraction form was developed for each of the outcomes (i.e., pain and disability). Data extraction was performed by the main author (CJ) using published data only. The following information was mandatory: authors, year of publication, and the number of patients allocated to exercise and control groups. Patient characteristics included age, BMI, and sex. Disease-specific factors were Kellgren/Lawrence (K/L) grade, baseline knee pain, and degree of malalignment of the affected leg. The therapeutic exercise program characteristics extracted were the number of supervised sessions, duration of intervention (weeks), type of exercise (aerobic, strengthening, performance, or a combination of different types of exercise), number of sessions per week, intensity, and length of exercise session. The intervention was classified as lower extremity performance exercise if the patients were practicing a specific activity with the lower extremity. The exercise programs were classified as consisting of a single type of exercise if at least 75% of the exercise session (including warm up and cool down) involved one type of exercise and as consisting of a combination of different types of exercise if several types of exercise with different aims were performed within the same session.
Summary measures
The effect was calculated as the standardized mean difference (SMD), allowing pooling and comparison of the various outcomes assessed in individual trials. The SMD was estimated as the difference between the mean change in the intervention and control groups divided by the pooled standard deviation (SD). The SD was extracted or estimated from the standard error, the 95% confidence interval (95% CI), the P value, or another method recommended by the Cochrane Collaboration (4). The SMD was clinically interpreted as originally proposed by Cohen (7). An SMD of 0.2 was considered small, an SMD of ∼0.5 was considered moderate (and would be recognized as clinically important), and an SMD >0.8 was considered large (7). Because this estimate is slightly biased and overestimates the effect size, a correction factor was applied to convert it to Hedges's g. If more than one patient-reported outcome was reported, a published list for extracting patient-reported outcomes on pain and disability for meta-analyses was used (8). The approach of Bliddal and Christensen (9) was applied, and the effect size measured in SMD was transformed into a visual analog scale (VAS) ranging from 0 to 100 mm. Furthermore, using the formula from Chinn in the Cochrane Handbook, the odds ratio and number needed to treat (NNT) were estimated (9–11).
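The effect-size steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The function names are hypothetical, the SD back-calculations use the large-sample normal approximation (a 95% CI spans 3.92 standard errors), and change scores are assumed to be coded so that a larger value means more improvement.

```python
import math

def sd_from_se(se, n):
    """SD of a group mean recovered from its standard error."""
    return se * math.sqrt(n)

def sd_from_ci(lower, upper, n, z=1.96):
    """SD recovered from a 95% CI of a group mean (normal approximation)."""
    se = (upper - lower) / (2 * z)
    return se * math.sqrt(n)

def smd_and_hedges_g(change_ex, sd_ex, n_ex, change_ctl, sd_ctl, n_ctl):
    """Return (uncorrected SMD, Hedges's g) for exercise vs. control."""
    df = n_ex + n_ctl - 2
    pooled_sd = math.sqrt(((n_ex - 1) * sd_ex**2 + (n_ctl - 1) * sd_ctl**2) / df)
    d = (change_ex - change_ctl) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    return d, correction * d

def smd_to_odds_ratio(smd):
    """Chinn's conversion: ln(OR) = SMD * pi / sqrt(3)."""
    return math.exp(smd * math.pi / math.sqrt(3))
```

With illustrative (made-up) inputs such as `smd_and_hedges_g(20, 20, 30, 10, 20, 30)`, the uncorrected SMD is 0.5 and the corrected value is slightly smaller, which is the direction of the bias correction the text describes.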
Combining the results of individual studies
The meta-analysis was performed using a random-effects model. Heterogeneity was examined as between-study variation and calculated as the I² statistic (12), which measures the proportion of variation (i.e., inconsistency) in the combined estimates due to between-study variance (13). An I² value of 0% indicates that no inconsistency was seen between the results of individual trials, and an I² value of 100% indicates maximal inconsistency. A relevant study-level covariate was defined as one able to decrease inconsistency measured as the I² statistic (and thus the between-study variance [τ²]) (3).
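As a rough sketch of how a random-effects pooled estimate and the inconsistency statistic fit together, the following Python function pools study effects and returns the between-study variance and I². The text does not say which between-study variance estimator was used. The DerSimonian-Laird moment estimator is assumed here purely for illustration, and the function name is hypothetical.

```python
import math

def random_effects_dl(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model (assumed)."""
    k = len(effects)
    w = [1.0 / v for v in variances]               # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0   # inconsistency, in %
    w_re = [1.0 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2, "q": q}
```

When all trials report the same effect, the sketch returns τ² = 0 and I² = 0%. As the trial results diverge relative to their sampling variances, both quantities grow.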
To evaluate the quantitative impact of inconsistency, prediction intervals encompassing the effect of a future study with 95% certainty were estimated according to the method of Higgins et al, using the between-study variance τ² and the standard error (14). To examine whether exercise therapy characteristics influenced clinical efficacy, these variables were used as independent variables in meta-regression analyses (15).
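For orientation, the two quantities named in this paragraph can be written out as follows. The notation is an assumption introduced here and is not taken from the original text: k is the number of trials, \hat{\mu} the pooled SMD with standard error SE(\hat{\mu}), y_i the observed effect in trial i with sampling variance v_i, and x_i a study-level covariate such as the number of supervised sessions.

```latex
% Approximate 95% prediction interval for the effect in a new trial
\[
\hat{\mu} \;\pm\; t_{k-2,\,0.975}\,\sqrt{\hat{\tau}^{2} + \mathrm{SE}(\hat{\mu})^{2}}
\]

% Random-effects meta-regression with one study-level covariate
\[
y_i = \beta_0 + \beta_1 x_i + u_i + \varepsilon_i, \qquad
u_i \sim N(0, \tau^{2}), \quad \varepsilon_i \sim N(0, v_i)
\]
```

A covariate that explains part of the variation between trials lowers the residual τ², which is the sense in which a covariate is called relevant in the preceding section.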
Assessment of risk of bias
Study quality was assessed with regard to the risk of selection bias, performance bias, detection bias, attrition bias, and reporting bias. Two reviewers (CJ and HL) independently assessed whether each of the following domains was adequate (i.e., low risk of bias), unclear, or inadequate: sequence generation, concealment of allocation, blinding, incomplete outcome data addressed, selective outcome reporting, and other biases (4). Sensitivity analyses were performed on different levels of risk of selection bias, performance bias, detection bias, attrition bias, and selective reporting bias. Small study bias was analyzed using Egger's test (16).
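Egger's test for small study bias is commonly set up as a regression of each trial's standardized effect on its precision, with the intercept capturing funnel plot asymmetry. The sketch below shows that regression only. It is a hypothetical helper, not the authors' implementation, and it leaves the significance test (intercept divided by its standard error, compared against a t distribution with n − 2 degrees of freedom) to the reader.

```python
import math

def egger_regression(effects, standard_errors):
    """Regress effect/SE on 1/SE; return (intercept, slope, se_intercept)."""
    n = len(effects)
    if n < 3:
        raise ValueError("at least 3 trials are needed")
    x = [1.0 / se for se in standard_errors]                  # precision
    y = [e / se for e, se in zip(effects, standard_errors)]   # standardized effect
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)  # zero if all SEs are identical
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    return intercept, slope, se_intercept
```

If every trial shows the same effect regardless of its size, the intercept is zero. If smaller trials (larger standard errors) systematically show larger effects, the intercept moves away from zero.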
Additional analyses
The impact of the amount of exercise was investigated by stratifying trials based on the number of supervised sessions into the following 3 groups: low (up to 12 sessions), intermediate (13–24 sessions), and large (≥25 sessions).
RESULTS
Study selection
The literature search resulted in 2,418 publications, with 258 unique trials identified as potentially eligible (Figure 1). Ten articles could not be evaluated due to language (articles in Czech, Chinese, Taiwanese, and Japanese). Forty-eight trials met the criteria and were included in the analysis (Table 1).

Figure 1. Flow chart of the identification of trials for inclusion in the meta-analysis. RCT = randomized controlled trial; OA = osteoarthritis; ACR = American College of Rheumatology; SMD = standardized mean difference.
Author, year (ref.) | Participants (intervention)a | Participants (control)a | Intervention | Outcome (pain) | Outcome (disability) | Quality: Seqv | Quality: Con | Quality: Data | Quality: SOR |
---|---|---|---|---|---|---|---|---|---|
Aglamis et al, 2008 (32) | n = 17; Age: 56.8; % female: 100; BMI: 34.2; K/L grade: 3 | n = 17; Age: 54.4; % female: 100; BMI: 32.1; K/L grade: 3 | 12-week multicomponent training program (aerobic, functional strengthening, and flexibility) vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | IN | UC |
An et al, 2008 (33) | n = 14; Age: 65.4; % female: 100; BMI: 25.7; K/L grade: NA | n = 14; Age: 64.6; % female: 100; BMI: 25.4; K/L grade: NA | 8 weeks of daily Baduanjin (traditional Chinese exercise) performed in the community during 30-minute classes vs. control | WOMAC pain subscale | WOMAC disability subscale | UC | UC | IN | UC |
Baker et al, 2001 (34) | n = 23; Age: 69; % female: 73.9; BMI: 31; K/L grade: 3 | n = 23; Age: 68; % female: 82.6; BMI: 32; K/L grade: 3 | 4-month home-based progressive strength training program vs. a nutrition education program (attention control) | WOMAC pain subscale | WOMAC disability subscale | AD | AD | IN | UC |
Bautch et al, 1997 (35) | n = 17; Age: 66; % female: 66.7; BMI: 32.6; K/L grade: 4 | n = 17; Age: 70; % female: 80; BMI: 24.9; K/L grade: 2 | 12 weeks of weekly exercise sessions (range of motion for upper and lower extremities and low-intensity walking) and educational program vs. educational program | VAS-pain | AIMS-physical activity | UC | UC | IN | UC |
Bennell et al, 2010 (36) | n = 45; Age: 64.5; % female: 51.1; BMI: 27.5; K/L grade: 3 | n = 44; Age: 64.6; % female: 45.5; BMI: 28.4; K/L grade: 3 | 12-week supervised home-based exercise program targeting the hip abductor and adductor muscles vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | UC | UC |
Bezalel et al, 2010 (37) | n = 25; Age: 73.8; % female: 68; BMI: NA; K/L grade: NA | n = 25; Age: 73.7; % female: 80; BMI: NA; K/L grade: NA | 4 weeks of a group education program once a week, followed by a self-executed home-based exercise program vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | AD | UC |
Börjesson et al, 1996 (38) | n = 34; Age: 64; % female: 50; BMI: 28.4; K/L grade: NA | n = 34; Age: 64; % female: 50; BMI: 27.7; K/L grade: NA | 5-week group training program (range of motion, strengthening) vs. control | VAS-walking | – | AD | UC | IN | UC |
Chang et al, 2012 (39) | n = 24; Age: 65; % female: 100; BMI: 24.9; K/L grade: 3 | n = 17; Age: 70.8; % female: 100; BMI: 25.7; K/L grade: 3 | 8 weeks of supervised exercise with elastic bands in addition to usual treatment vs. usual treatment | WOMAC pain subscale | WOMAC disability subscale | UC | AD | IN | UC |
Cheing et al, 2002 (40) | n = 15; Age: 60.9; % female: 86.7; BMI: 29.6; K/L grade: NA | n = 18; Age: 64.1; % female: 93.8; BMI: 28.8; K/L grade: NA | 5-week exercise program (quadriceps and hamstring strengthening) vs. TENS placebo stimulation | VAS-pain | – | UC | UC | IN | UC |
Ettinger et al, 1997 (41) | |||||||||
Aerobic walking program | n = 144; Age: 69; % female: 69; BMI >30: 50%; K/L grade: NA | n = 149; Age: 69; % female: 69; BMI >30: 58.4%; K/L grade: NA | 3-month facility-based aerobic walking program followed by 15-month home-based walking program (36 supervised sessions) vs. health education program | VAS-activity | Self-reported disability FAST | IN | IN | UC | UC |
Resistance exercise program | n = 146; Age: 68; % female: 73; BMI >30: 49.3%; K/L grade: NA | n = 149; Age: 69; % female: 69; BMI >30: 58.4%; K/L grade: NA | 3-month facility-based resistance exercise program followed by 15-month resistance exercise program (36 supervised sessions) vs. health education program | VAS-activity | Self-reported disability FAST | IN | IN | UC | UC |
Evgeniadis et al, 2008 (42) | n = 24; Age: 67.1; % female: 70; BMI: 34.7; K/L grade: 3 | n = 24; Age: 69.4; % female: 83; BMI: 33.5; K/L grade: 3 | 3-week preoperative supervised home-based exercise program for strengthening trunk and upper extremities vs. standard preoperative care | SF-36 bodily pain | SF-36 physical function | IN | IN | IN | UC |
Fransen et al, 2001 (43) | n = 83; Age: 67.0; % female: 76; BMI: 30.0; K/L grade: NA | n = 43; Age: 66.1; % female: 67; BMI: 28.3; K/L grade: NA | 8 weeks of muscle strengthening, aerobic exercise, and stretching vs. home-based stretching and walking | WOMAC pain subscale | WOMAC disability subscale | IN | IN | IN | UC |
Foroughi et al, 2011 (44) | n = 26; Age: 64; % female: 100; BMI: 31.9; K/L grade: 3 | n = 28; Age: 64; % female: 100; BMI: 33.2; K/L grade: 2 | 6-month high-intensity (80%) resistance exercise program vs. sham exercise program (minimal resistance during exercise) | WOMAC pain subscale | WOMAC disability subscale | UC | AD | IN | UC |
Gür et al, 2002 (31) | n = 17; Age: 56; % female: NA; BMI: 31.9; K/L grade: 2.4 | n = 6; Age: 57; % female: NA; BMI: 32.3; K/L grade: 2.5 | 8 weeks of muscle strengthening for knee extensor and knee flexor vs. control | VAS-activity | – | UC | UC | IN | UC |
Horstmann et al, 2000 (45) | n = 20; Age: 51.8; % female: 31.6; BMI: 27.6; K/L grade: NA | n = 20; Age: 52.5; % female: 21.0; BMI: 28.7; K/L grade: NA | 4 weeks of isokinetic exercise for improving strength and reducing pain | VAS-rest | – | UC | UC | IN | UC |
Huang et al, 2003 (46) | n = 99; Age: 62; % female: 70; BMI: NA; K/L grade: NA | n = 33; Age: 62; % female: 70; BMI: NA; K/L grade: NA | 8 weeks of muscle-strengthening exercise vs. control | VAS-activity | – | UC | AD | IN | UC |
Huang et al, 2005 (47) | n = 35; Age: 65; % female: 81; BMI: NA; K/L grade: NA | n = 35; Age: 65; % female: 81; BMI: NA; K/L grade: NA | 8 weeks of supervised isokinetic muscle-strengthening exercise vs. control | VAS-activity | – | UC | AD | IN | UC |
Jan et al, 2009 (17) | n = 71; Age: 62.6; % female: 69; BMI: 24.9; K/L grade: 2 | n = 35; Age: 62.2; % female: 68.5; BMI: 24.9; K/L grade: 2 | 8 weeks of weight-bearing and non–weight-bearing exercises compared to control | – | WOMAC disability subscale | AD | UC | AF | UC |
Jan et al, 2008 (29) | |||||||||
High-resistance exercise program | n = 34; Age: 63.3; % female: 79; BMI: 24.1; K/L grade: 2 | n = 30; Age: 62.8; % female: 83; BMI: 24.1; K/L grade: 2 | 8 weeks of high-resistance exercise (60% of 1RM) vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | UC | IN | UC |
Low-resistance exercise program | n = 34; Age: 61.8; % female: 79; BMI: 24.0; K/L grade: 2 | n = 30; Age: 62.8; % female: 83; BMI: 24.1; K/L grade: 2 | 8 weeks of low-resistance exercise (10% of 1RM) vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | UC | IN | UC |
Keefe et al, 2004 (48) | |||||||||
Pain coping skill training and exercise training | n = 20; Age: 60.2; % female: 65; BMI: NA; K/L grade: NA | n = 18; Age: 60.0; % female: 50; BMI: NA; K/L grade: NA | 12 weeks of spouse-assisted pain coping skill training and exercise training (endurance, strength, and flexibility) vs. spouse-assisted pain coping skill training | AIMS-pain | – | UC | AD | IN | UC |
Exercise training | n = 16; Age: 60.3; % female: 38; BMI: NA; K/L grade: NA | n = 18; Age: 57.6; % female: 61; BMI: NA; K/L grade: NA | 12 weeks of exercise training (endurance, strength, and flexibility) vs. standard care | AIMS-pain | – | UC | AD | IN | UC |
Kovar et al, 1992 (49) | n = 52; Age: 70.4; % female: 77; BMI: 29.0; K/L grade: NA | n = 50; Age: 68.5; % female: 90; BMI: 30.1; K/L grade: NA | 8 weeks of supervised fitness walk and patient education vs. standard routine medical care | AIMS-pain | AIMS-physical activity | AD | UC | IN | UC |
Lee et al, 2009 (50) | n = 29; Age: 70.2; % female: 93; BMI: 26.0; K/L grade: 3 | n = 15; Age: 66.9; % female: 93; BMI: 26.0; K/L grade: 2 | 8 weeks of Tai Chi Qigong twice a week vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | AD | UC |
Lim et al, 2008 (51) | n = 53; Age: 65.6; % female: 57; BMI: 28.6; K/L grade: 3 | n = 54; Age: 63.6; % female: 54; BMI: 29.3; K/L grade: 3 | 12 weeks of quadriceps strengthening 5 days a week vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | AD | UC |
Lim et al, 2010 (52) | |||||||||
Aquatic exercise program | n = 26; Age: 65.7; % female: 88; BMI: 27.9; K/L grade: NA | n = 24; Age: 63.3; % female: 88; BMI: 27.7; K/L grade: NA | 8 weeks of aquatic exercise 3 times a week vs. control | BPI-pain | SF-36 physical composite score | UC | AD | UC | UC |
Land-based exercise program | n = 25; Age: 67.7; % female: 84; BMI: 27.6; K/L grade: NA | n = 24; Age: 63.3; % female: 88; BMI: 27.7; K/L grade: NA | 8 weeks of land-based exercise (lower limb strengthening, stretching, and joint mobilization) 3 times a week vs. control | BPI-pain | SF-36 physical composite score | UC | AD | UC | UC |
Lin et al, 2009 (53) | |||||||||
Strengthening exercise program | n = 36; Age: 63.7; % female: 69; BMI: 23.9; K/L grade: 3 | n = 36; Age: 62.2; % female: 72; BMI: 24.7; K/L grade: 3 | 8 weeks of non–weight-bearing strengthening exercise program for quadriceps 3 times a week vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | AD | UC |
Proprioceptive exercise program | n = 36; Age: 61.6; % female: 67; BMI: 23.74; K/L grade: 3 | n = 36; Age: 62.2; % female: 72; BMI: 24.7; K/L grade: 3 | 8 weeks of non–weight-bearing proprioceptive exercise 3 times a week vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | AD | UC |
Lund et al, 2008 (54) | |||||||||
Aquatic exercise program | n = 27; Age: 65; % female: 83; BMI: 27.4; K/L grade: NA | n = 27; Age: 70; % female: 66; BMI: 26.1; K/L grade: NA | 8 weeks of aquatic exercise (aerobic, strengthening, balance, and stretching exercise) twice a week vs. control | KOOS-pain | KOOS-ADL | AD | AD | AD | UC |
Land-based exercise | n = 25; Age: 68; % female: 88; BMI: 23.7; K/L grade: NA | n = 27; Age: 70; % female: 66; BMI: 26.1; K/L grade: NA | 8 weeks of land-based exercise (aerobic, strengthening, balance, and stretching exercise) twice a week vs. control | KOOS-pain | KOOS-ADL | AD | AD | AD | UC |
Maurer et al, 1999 (55) | n = 57; Age: 66.3; % female: 47; BMI: NA; K/L grade: NA | n = 56; Age: 64.5; % female: 36; BMI: NA; K/L grade: NA | 8 weeks of isokinetic quadriceps muscle training 3 times a week vs. educational sessions | WOMAC pain subscale | WOMAC disability subscale | AD | UC | IN | UC |
McCarthy et al, 2004 (56) | n = 111; Age: 64.5; % female: NA; BMI: 29.4; K/L grade: NA | n = 103; Age: 64.9; % female: NA; BMI: 30.2; K/L grade: NA | 8 weeks of a class-based exercise program twice a week in addition to home-based exercise vs. home-based exercise | WOMAC pain subscale | WOMAC disability subscale | UC | AD | UC | UC |
McKnight et al, 2010 (57) | n = 95; Age: 51.9; % female: 76; BMI: 27.4; K/L grade: 2 | n = 87; Age: 52.6; % female: 75; BMI: 27.9; K/L grade: 2 | 9 months of quadriceps strengthening exercise twice a week in addition to a self-management program vs. a self-management program alone | WOMAC pain subscale | WOMAC disability subscale | AD | AD | AD | UC |
Messier et al, 2004 (20) | |||||||||
Aerobic and strengthening exercise | n = 80; Age: 69; % female: 74; BMI: 34.2; K/L grade: 2.2 | n = 78; Age: 69; % female: 68; BMI: 34.2; K/L grade: 2.2 | 18-month exercise program (aerobic and strengthening exercise) 3 times a week vs. healthy lifestyle education | WOMAC pain subscale | WOMAC disability subscale | AD | UC | AD | AD |
Aerobic and strengthening exercise plus diet program | n = 76; Age: 69; % female: 74; BMI: 34.0; K/L grade: 2.3 | n = 82; Age: 68; % female: 72; BMI: 34.5; K/L grade: 2.3 | 18-month exercise program (aerobic and strengthening exercise) 3 times a week in addition to a diet program vs. a diet program | WOMAC pain subscale | WOMAC disability subscale | AD | UC | AD | AD |
Ni et al, 2010 (58) | n = 18; Age: 62.9; % female: 100; BMI: 26.4; K/L grade: NA | n = 17; Age: 63.5; % female: 100; BMI: 26.7; K/L grade: NA | 24 weeks of Tai Chi 2–4 times a week vs. attention control (wellness education, stretching) | WOMAC pain subscale | WOMAC disability subscale | AD | AD | IN | UC |
Peloquin et al, 1999 (59) | n = 59; Age: 65.6; % female: 71; BMI: 29.8; K/L grade: 2 | n = 65; Age: 66.4; % female: 69; BMI: 29.8; K/L grade: 2 | 3 months of 1-hour exercise sessions 3 times a week (aerobic exercise, resistance training, and stretching) vs. control | AIMS-pain | AIMS-walking and bending | AD | UC | IN | UC |
Petrella et al, 2000 (60) | n = 88; Age: 72.9; % female: 58; BMI: NA; K/L grade: 1.5 | n = 89; Age: 74.6; % female: 62; BMI: NA; K/L grade: 1 | 8 weeks of home-based progressive exercise program in addition to NSAIDs vs. NSAIDs | WOMAC pain subscale | WOMAC disability subscale | UC | UC | UC | UC |
Rooks et al, 2006 (61) | n = 22; Age: 65; % female: 50; BMI: 35.7; K/L grade: NA | n = 23; Age: 69; % female: 57; BMI: 33.9; K/L grade: NA | 6 weeks of preoperative water- and land-based exercise (cardiovascular, strength, and flexibility) vs. control (education) | WOMAC pain subscale | WOMAC disability subscale | UC | IN | IN | UC |
Rosemffet et al, 2004 (62) | n = 8; Age: 60; % female: 76.9; BMI: 31.3; K/L grade: NA | n = 8; Age: 60; % female: 76.9; BMI: 32.9; K/L grade: NA | 8-week exercise program (walking and resistance exercise) in addition to functional electrical stimulation vs. functional electrical stimulation | WOMAC pain subscale | – | UC | UC | IN | UC |
Røgind et al, 1998 (63) | n = 12; Age: 69.3; % female: 91; BMI: 27.4; K/L grade: 3 | n = 13; Age: 73; % female: 92; BMI: 26.8; K/L grade: 3 | 3 months of exercise training (general fitness, balance, coordination, stretching, and lower extremity muscle strength) and a daily home program vs. control | VAS-weight bearing | – | AD | UC | UC | UC |
Salli et al, 2010 (64) | |||||||||
Isokinetic exercise program | n = 23; Age: 55.7; % female: 83; BMI: 31.5; K/L grade: 2 | n = 24; Age: 58.3; % female: 79; BMI: 32.8; K/L grade: 2 | 8 weeks of isokinetic exercises 3 times a week vs. control | WOMAC pain subscale | WOMAC disability subscale | UC | AD | IN | UC |
Isometric exercise program | n = 24; Age: 57.1; % female: 83; BMI: 32.7; K/L grade: 2 | n = 24; Age: 58.3; % female: 79; BMI: 32.8; K/L grade: 2 | 8 weeks of isometric exercises 3 times a week vs. control | WOMAC pain subscale | WOMAC disability subscale | UC | AD | IN | UC |
Sayers et al, 2012 (65) | |||||||||
High-speed resistance exercise | n = 12; Age: 66.9; % female: 75; BMI: 28.4; K/L grade: 1.6 | n = 11; Age: 68.4; % female: 73; BMI: 30.8; K/L grade: 1.8 | 12 weeks of high-speed resistance exercise (40% of 1RM) 3 times a week vs. stretching 3 times a week | WOMAC pain subscale | WOMAC disability subscale | AD | AD | IN | UC |
Low-speed resistance exercise | n = 10; Age: 65.9; % female: 80; BMI: 33.1; K/L grade: 1.6 | n = 11; Age: 68.4; % female: 73; BMI: 30.8; K/L grade: 1.8 | 12 weeks of low-speed resistance exercise (80% of 1RM) 3 times a week vs. stretching 3 times a week | WOMAC pain subscale | WOMAC disability subscale | AD | AD | IN | UC |
Schilke et al, 1996 (66) | n = 10; Age: 64.5; % female: 85; BMI: NA; K/L grade: NA | n = 10; Age: 68.4; % female: 85; BMI: NA; K/L grade: NA | 8 weeks of muscle strengthening exercises 3 times a week vs. control | OASI-pain | OASI-mobility | AD | AD | IN | UC |
Swank et al, 2011 (67) | n = 36; Age: 63.1; % female: 67; BMI: 35.9; K/L grade: NA | n = 35; Age: 62.6; % female: 63; BMI: 32.9; K/L grade: NA | Short-term pre-rehabilitation (4–8 weeks) resistance training, flexibility, and step training 3 times a week vs. control | Pain-walking | – | UC | UC | AD | UC |
Talbot et al, 2003 (68) | n = 17; Age: 69.6; % female: 76.5; BMI: 31.0; K/L grade: 2.5 | n = 17; Age: 70.8; % female: 76.5; BMI: 32.6; K/L grade: 2 | 12 weeks of individualized instruction in the use of a pedometer in addition to Arthritis Self-Management program (1 hour a week) vs. Arthritis Self-Management program | VAS-pain intensity | – | AD | AD | IN | UC |
Thorstensson et al, 2005 (21) | n = 30; Age: 54.8; % female: 50; BMI: 29.6; K/L grade: 3 | n = 31; Age: 57.3; % female: 52; BMI: 29.5; K/L grade: 3 | 6 weeks of a high-intensity program twice a week (weight-bearing exercises, endurance, and lower limb strength) vs. control | KOOS-pain | KOOS-ADL | AD | AD | IN | AD |
Topp et al, 2002 (69) | n = 67; Age: 64.6; % female: 69; BMI: NA; K/L grade: NA | n = 35; Age: 60.9; % female: 80; BMI: NA; K/L grade: NA | 16 weeks of strength exercises (dynamic or isometric) vs. control | WOMAC pain subscale | WOMAC disability subscale | UC | AD | AD | UC |
Trans et al, 2009 (70) | |||||||||
Whole-body vibration exercise (balance platform) | n = 18; Age: 58.7; % female: 100; BMI: 29.1; K/L grade: NA | n = 17; Age: 61.1; % female: 100; BMI: 30.2; K/L grade: NA | 8 weeks of whole-body vibration exercise (balance platform) vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | AD | UC |
Whole-body vibration exercise (stable platform) | n = 17; Age: 61.5; % female: 100; BMI: 29.2; K/L grade: NA | n = 17; Age: 61.1; % female: 100; BMI: 30.2; K/L grade: NA | 8 weeks of whole-body vibration exercise (stable platform) vs. control | WOMAC pain subscale | WOMAC disability subscale | AD | AD | AD | UC |
Tsauo et al, 2008 (71) | n = 30; Age: 62.8; % female: 90; BMI: 26.4; K/L grade: 2 | n = 30; Age: 61.1; % female: 83; BMI: 28.4; K/L grade: 2 | 8 weeks of a sensorimotor program in addition to routine physical therapy compared to routine physical therapy | WOMAC pain subscale | WOMAC disability subscale | AD | AD | IN | UC |
Wang et al, 2011 (72) | |||||||||
Aquatic exercise program | n = 28; Age: 66.7; % female: 86; BMI: 26.6; K/L grade: NA | n = 28; Age: 67.9; % female: 85; BMI: 26.6; K/L grade: NA | 12 weeks of aquatic exercise (balance and flexibility) 3 times a week vs. control | KOOS-pain | KOOS-ADL | AD | AD | AD | UC |
Land-based exercise program | n = 28; Age: 68.3; % female: 89; BMI: 25.4; K/L grade: NA | n = 28; Age: 67.9; % female: 85; BMI: 26.6; K/L grade: NA | 12 weeks of land-based exercise (balance and flexibility) 3 times a week vs. control | KOOS-pain | KOOS-ADL | AD | AD | AD | UC |
Weidenhielm et al, 1993 (73) | n = 19; Age: 64; % female: 58; BMI: 30; K/L grade: 2 | n = 20; Age: 63; % female: 45; BMI: 29.1; K/L grade: 2 | 3 months of preoperative physiotherapy (range of motion, muscle strength) vs. control | VAS-walking | – | AD | UC | IN | UC |
Weng et al, 2009 (74) | n = 33; Age: 64; % female: 80; BMI: NA; K/L grade: NA | n = 33; Age: 64; % female: 80; BMI: NA; K/L grade: NA | 8-week exercise program (isokinetic muscular strengthening) 3 times a week vs. control | VAS | – | AD | AD | IN | UC |
- Seqv = sequence generation; Con = concealment of allocation; Data = incomplete outcome data addressed; SOR = selective outcome reporting; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; AD = adequate; IN = inadequate; UC = unclear; NA = not assessable; VAS = visual analog scale; AIMS = Arthritis Impact Measurement Scales; TENS = transcutaneous electrical nerve stimulation; FAST = Fitness Arthritis and Seniors Trial; SF-36 = Short Form 36; 1RM = one repetition maximum; BPI = Brief Pain Inventory; KOOS-ADL = Knee Injury and Osteoarthritis Outcome Score–Activities of Daily Living; NSAIDs = nonsteroidal antiinflammatory drugs; OASI = Osteoarthritis Screening Index.
- a The values for age are the mean in years, the values for body mass index (BMI) are the mean, and the values for the Kellgren/Lawrence (K/L) grade are the median, except in the studies by Gür et al, Messier et al, and Sayers et al, where the K/L grade is the mean.
These 48 trials compared 59 exercise interventions with control treatments. The only study for which SMD for pain was not available was the study by Jan et al (17), leaving 47 of 48 trials, with a total of 4,028 patients, with sufficient data for estimating SMD on at least one pain outcome and 35 trials, with a total of 2,732 patients, with data on patient-reported disability (Table 1).
Study characteristics
Across the included trials, the mean age of the patients averaged 64.3 years (range 52.2–73.8 years). On average, 75% of the patients were women (range 26–100%). The mean BMI was 29.1 (range 24.0–34.8). Baseline pain score (transformed to a scale ranging from 0 [no pain] to 100) was available for 45 of 47 trials. The mean pain score at baseline was 46.3 (range 23.7–75.2). The K/L grade was reported in 23 trials, which included a total of 1,378 patients (169 [12.3%] had K/L grade 1, 639 [46.4%] had K/L grade 2, 462 [33.5%] had K/L grade 3, and 108 [7.8%] had K/L grade 4). Eleven of these trials were classified as including patients with, on average, severe knee OA (median K/L grade ≥3), and 12 were classified as including patients with, on average, mild to moderate knee OA (median K/L grade ≤2.5) (Table 1).
Results of individual studies
Most trials showed a positive effect in reducing pain and disability. The effect sizes of the individual trials ranged from a small negative effect to a very large positive effect (Figure 2). (Also see Supplementary Figure 1, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.38290/abstract.) Negative effects were mostly observed in trials using a combination of different exercise types, and large positive effects were mostly observed with either resistance training or aerobic exercise.

Figure 2. Forest plot of the effect of exercise therapy on reduction in patient-reported pain, stratified by type of exercise. Weights are from a random-effects analysis. Subtotal effects and the overall effect are shown with both the 95% confidence interval (95% CI) and the prediction interval. nE = number of patients in the intervention group; mE = change in pain in the intervention group; sdE = standard deviation of change in the intervention group; nC = number of patients in the control group; mC = change in pain in the control group; sdC = standard deviation of change in the control group; SMD = standardized mean difference.
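As an illustration of how a per-trial SMD can be derived from the columns listed in the forest plot, the following minimal sketch computes a bias-corrected SMD (Hedges' g) and its approximate variance. It assumes that sdE and sdC are standard deviations of the change scores and that mE and mC are mean reductions in pain; the function name and the example trial are hypothetical, and the exact estimator used by the authors is not stated in this section.

```python
import math


def hedges_g(n_e, m_e, sd_e, n_c, m_c, sd_c):
    """Return (SMD, variance) for one trial, using Hedges' small-sample correction.

    m_e and m_c are mean reductions in pain, so a positive SMD favors exercise.
    """
    # Pooled standard deviation of the change scores across both arms.
    pooled_var = ((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2) / (n_e + n_c - 2)
    d = (m_e - m_c) / math.sqrt(pooled_var)
    # Approximate correction factor; it shrinks d slightly in small trials.
    j = 1 - 3 / (4 * (n_e + n_c - 2) - 1)
    g = j * d
    # Common large-sample approximation to the variance of g.
    var_g = (n_e + n_c) / (n_e * n_c) + g**2 / (2 * (n_e + n_c))
    return g, var_g


# Hypothetical trial: 40 patients per arm, pain reduced by 20 mm versus 10 mm, SD 20 mm.
g, var_g = hedges_g(40, 20.0, 20.0, 40, 10.0, 20.0)
print(round(g, 3), round(var_g, 4))
```

The inverse of the returned variance is the within-trial weight that enters the pooled analysis.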
Synthesis of results
The overall pooled SMD for pain reduction was 0.50 (95% CI 0.39, 0.62) (P < 0.001) in favor of exercise, with substantial heterogeneity (I2 = 62.0%), which was also reflected in the 95% prediction interval (−0.19, 1.20) (Figure 2). For disability, the corresponding values were 0.49 (95% CI 0.35, 0.63) (P < 0.001, I2 = 68.8%), and the 95% prediction interval was −0.29, 1.27. The prediction interval shows the potential effect of the treatment when applied within an individual study setting, which differs from the average effect. The 95% prediction intervals for both pain and disability contain values less than 0. This means that although exercise therapy on average seems effective in reducing pain and disability, not all future individual trials can be expected to show beneficial effects of exercise therapy (18). When the effect size measured in SMD was transformed to a VAS (0–100 mm), the overall effect of exercise therapy was a reduction of 8.5 mm (95% CI 6.5, 10.5) in pain and 8.3 mm (95% CI 5.9, 10.7) in disability. The NNT for pain and disability was estimated to be 6 patients (95% CI 5, 7) and 6 patients (95% CI 5, 8), respectively.
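The relationship between the pooled estimate, I2, and the prediction interval can be sketched as follows. This is a minimal example assuming the DerSimonian-Laird estimator of between-trial variance (the specific estimator is an assumption, not confirmed by the text above); the three trials are hypothetical, and the t quantile is passed in by the caller so that only the standard library is needed.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird pooling; returns (pooled, se, tau2, i2_percent)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-trial variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau2 to each within-trial variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, i2


def prediction_interval(pooled, se, tau2, t_crit):
    """Interval for the effect in a new trial; t_crit is the t quantile with k - 2 df."""
    half = t_crit * math.sqrt(tau2 + se**2)
    return pooled - half, pooled + half


# Hypothetical SMDs from three trials with equal within-trial variance.
pooled, se, tau2, i2 = random_effects_pool([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
# 12.706 is the 97.5% t quantile with 1 degree of freedom (3 trials minus 2).
low, high = prediction_interval(pooled, se, tau2, t_crit=12.706)
print(round(pooled, 2), round(tau2, 2), round(i2, 1), round(low, 2), round(high, 2))
```

Because the prediction interval adds the between-trial variance to the uncertainty of the mean, it can include 0 even when the confidence interval for the pooled effect does not.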
Type of exercise
In the stratified meta-analysis, similar effects were found for exercise programs including either aerobic exercise (SMD 0.67 for pain and 0.56 for disability), resistance exercise (SMD 0.62 for pain and 0.60 for disability), or performance exercise (SMD 0.48 for pain and 0.56 for disability) (for the difference in effect between exercise types, P = 0.773 for pain and P = 0.968 for disability) (Figure 2). (Also see Supplementary Figure 1, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.38290/abstract.) When the studies that evaluated only a single exercise type were pooled, the SMD for pain was 0.61 (95% CI 0.48, 0.75) and the SMD for disability was 0.58 (95% CI 0.40, 0.75), but with large heterogeneity for both pain (I2 = 60.0%) and disability (I2 = 72.5%). Exercise programs that included a combination of resistance, aerobic, and performance exercise were not significantly better than control treatments in reducing pain (SMD 0.16 [95% CI −0.04, 0.37], I2 = 44.0%) and had only a small effect in reducing disability (SMD 0.22 [95% CI 0.08, 0.37], I2 = 0%). The difference between programs focusing on one type of exercise and programs mixing two or more types was significant for both outcomes (difference in SMD for pain 0.45 [95% CI 0.20, 0.69], P < 0.001 and difference in SMD for disability 0.36 [95% CI 0.13, 0.58], P = 0.002), in favor of using only one type of exercise (Figure 2). (Also see Supplementary Figure 1.)
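The comparison between single-type and mixed programs can be approximated from the reported subgroup estimates alone. The sketch below assumes a z-test for the difference between two independent pooled SMDs, with standard errors backed out of the reported 95% CIs; this is an illustrative reconstruction with hypothetical function names, not necessarily the procedure the authors used.

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def se_from_ci(lower, upper):
    """Back out a standard error from a reported 95% confidence interval."""
    return (upper - lower) / (2 * Z95)


def compare_subgroups(est_a, ci_a, est_b, ci_b):
    """z-test for the difference between two independent pooled estimates."""
    se_diff = (se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2) ** 0.5
    diff = est_a - est_b
    z = diff / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided P value
    return diff, (diff - Z95 * se_diff, diff + Z95 * se_diff), p


# Pain: single-type programs (SMD 0.61) versus mixed programs (SMD 0.16).
diff, ci, p = compare_subgroups(0.61, (0.48, 0.75), 0.16, (-0.04, 0.37))
print(f"difference {diff:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, P = {p:.4f}")
```

With the rounded values reported above, this approach gives a difference and interval close to the reported 0.45 (95% CI 0.20, 0.69); small discrepancies are expected because the inputs are rounded to two decimals.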
Aerobic exercise
Heterogeneity in the aerobic exercise group could not be explained by using patient characteristics or disease-specific factors as covariates. However, heterogeneity was reduced for both pain and disability (I2 reduced to 9.8% and 52.5%, respectively) when the number of supervised sessions was used as the covariate in meta-regression analyses, indicating a positive dose-response effect. The SMD for pain reduction increased significantly with a larger number of supervised aerobic exercise sessions (slope 0.022 [95% CI 0.002, 0.043], P = 0.036) (Figure 3), but did not reach significance for disability (slope 0.021 [95% CI −0.008, 0.050], P = 0.125). (See Supplementary Figure 2, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.38290/abstract.) This means that for every additional 10 supervised sessions, the effect size of aerobic exercise for pain was increased by more than 0.2, which is comparable with the pain relief seen with acetaminophen for knee OA in a meta-analysis by Zhang et al (19). No statistically significant regression coefficients were found when the other exercise program characteristics were used as covariates in a meta-regression analysis.

Figure 3. Meta-regression analysis of the effect of aerobic exercise on reducing pain. Standardized mean differences of reduction in patient-reported pain (y-axis) and number of supervised exercise sessions (x-axis) are shown. Weights of included trials were based on the inverse of the total variance and are shown by the size of the circles.
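The dose-response slope can be illustrated with an inverse-variance weighted least-squares fit of SMD on the number of supervised sessions. This is a simplified sketch: the trial data and the intercept of 0.30 are hypothetical and were constructed to lie exactly on a line with the reported slope of 0.022, and a full meta-regression would also estimate the residual between-trial variance.

```python
def weighted_slope(x, y, variances):
    """Inverse-variance weighted least-squares fit of y = intercept + slope * x."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    # Weighted means of the covariate and the effect size.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical trials lying exactly on a line with slope 0.022 per session.
sessions = [8, 12, 24, 36, 48]
smd = [0.30 + 0.022 * s for s in sessions]
total_variance = [0.09, 0.06, 0.05, 0.08, 0.10]  # within-trial variance plus tau^2
intercept, slope = weighted_slope(sessions, smd, total_variance)
# Change in SMD for every additional 10 supervised sessions.
print(round(slope, 3), round(slope * 10, 2))
```

Multiplying the slope by 10 gives the change in effect size per 10 additional supervised sessions, which is how the figure of more than 0.2 in the text is obtained.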
Resistance exercise
No statistically significant regression coefficients were found using intensity, length of exercise program, number of supervised sessions, duration of individual supervised sessions, or number of sessions per week as covariates in a meta-regression analysis.
Heterogeneity in the resistance exercise group could not be explained by using patient characteristics or disease-specific factors as covariates. Stratified analyses showed that exercise programs focusing on quadriceps strength only were more beneficial in reducing pain than programs aimed at improving general lower limb strength (SMD 0.85 [95% CI 0.55, 1.14], I2 = 77.0% versus 0.39 [95% CI 0.27, 0.52], I2 = 5.7%) (P = 0.005). Programs focusing on quadriceps strength only were also more effective at reducing disability (SMD 0.87 [95% CI 0.45, 1.29], I2 = 85.6% versus 0.36 [95% CI 0.18, 0.52], I2 = 35.1%) (P = 0.029).
Disease severity
Stratified analysis showed similar effects for pain in patients with severe knee OA (SMD 0.60 [95% CI 0.38, 0.82], I2 = 36.1%) and those with mild/moderate knee OA (SMD 0.66 [95% CI 0.34, 0.99], I2 = 77.0%) (P = 0.736). Although exercise therapy seemed to reduce patient-reported disability less in patients with severe knee OA (SMD 0.39 [95% CI 0.05, 0.74], I2 = 73.6%) than in patients with mild/moderate knee OA (SMD 0.66 [95% CI 0.32, 0.99], I2 = 84.6%) (P = 0.282), the differences did not reach significance.
Age, sex, BMI, alignment in the knee, and baseline pain
None of the characteristics age, sex, BMI, or baseline pain significantly reduced heterogeneity when entered into the respective analyses. No analysis was performed on alignment in the knee, since sufficient data were reported in only one trial.
Risk of bias
The risk of bias domains “sequence generation,” “concealment of allocation,” and “incomplete outcome data addressed” were assessed as adequate (i.e., low risk of bias) in 30 (62.5%), 31 (64.6%), and 13 (27.1%) of the trials included, respectively. Only two trials (20, 21) comparing 3 exercise interventions with controls were assessed as adequate in “selective outcome reporting” since most trials were not registered (i.e., in clinicaltrials.gov or similar) or lacked a published protocol. All other trials were therefore deemed unclear in “selective outcome reporting.” In exercise trials, the blinding of patients and therapist to group allocation is not possible, and thus none of the trials included performed this adequately. Since outcomes of interest were patient-reported, the blinding of the outcome assessor was not relevant to assess in these studies. No information was available on the risk of trial sponsors benefitting economically from a positive outcome. Thus, all trials were deemed unclear on “blinding” and “other biases,” and no further analyses were performed. When the studies were stratified according to the different aspects of study quality, no significant differences were found for “sequence generation,” “concealment of allocation,” or “incomplete outcome data addressed.” Analysis of the trials for “selective outcome reporting” showed statistically significant differences in favor of the subgroup classified as unclear for both pain (P = 0.015) and disability (P < 0.001). However, only two trials were clearly free of “selective outcome reporting” (see Supplementary Figures 3 and 4, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.38290/abstract) (20, 21). Risk of small study bias was significant for pain (P = 0.011) and marginal for disability (P = 0.075), by Egger's test.
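For readers unfamiliar with Egger's test, the following minimal sketch shows the underlying regression: each trial's standardized effect (SMD divided by its standard error) is regressed on its precision (1 divided by the standard error), and an intercept that differs from 0 suggests small-study effects. The data are hypothetical, and the P value is left to the reader, since it requires comparing the t statistic with a t distribution with k − 2 degrees of freedom.

```python
import math


def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE; returns (intercept, slope, t_statistic)."""
    n = len(effects)
    x = [1.0 / s for s in std_errors]                 # precision
    y = [e / s for e, s in zip(effects, std_errors)]  # standardized effect
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom (requires at least 3 trials).
    resid_var = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar**2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.inf
    return intercept, slope, t_stat


# Hypothetical trials in which less precise trials tend to report larger effects.
std_errors = [0.10, 0.15, 0.20, 0.30, 0.40]
effects = [0.42, 0.50, 0.48, 0.75, 0.80]
intercept, slope, t_stat = egger_regression(effects, std_errors)
print(round(intercept, 2), round(slope, 2), round(t_stat, 2))
```

A single imprecise trial with a large effect can dominate the intercept, which is one reason the result of such a test should be read alongside the weights of the individual trials.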
Number of exercise sessions per week
Interventions consisting of a single exercise type with 3 or more sessions per week seemed to be more efficacious in reducing pain (SMD 0.68 [95% CI 0.51, 0.85], I2 = 60.8%) than those with less than 2 sessions per week (SMD 0.41 [95% CI 0.25, 0.55], I2 = 0%) (P = 0.017). Similar results were found for effect on disability for 3 or more sessions per week (SMD 0.67 [95% CI 0.44, 0.89], I2 = 76.9%) compared with less than 2 sessions per week (SMD 0.33 [95% CI 0.18, 0.49], I2 = 0%) (P = 0.017), but large heterogeneity was seen in the results of the studies that included more frequent exercise. Subgroup analyses according to the number of supervised sessions demonstrated some variations, but no statistically significant differences were found (see Supplementary Table 1, available on the Arthritis & Rheumatology web site at http://onlinelibrary.wiley.com/doi/10.1002/art.38290/abstract).
DISCUSSION
The main findings of this meta-analysis of 48 RCTs with a total of more than 4,000 patients were that exercise therapy programs focusing on a single type of exercise are more efficacious in reducing pain and patient-reported disability than those mixing several types of exercise with different goals within the same session; an increased number of supervised sessions enhances the benefits of aerobic exercise; exercise focusing on knee extensor muscle strength only may increase the benefits of resistance training; and exercise seems to be effective therapy for knee OA, regardless of age, sex, BMI, radiographic status, or baseline pain.
The importance of focusing on one type of exercise is supported by recently published meta-analyses (1, 22). One explanation for the disadvantage of mixing types of exercise with different goals within the same session may be the molecular response, where resistance training increases the myofibrillar protein response and aerobic exercise increases the content of mitochondria in the muscle. This molecular response was decreased when both aerobic and resistance exercise were performed within the same session (23). However, in the early stages of a training program, the most commonly accepted explanation for the increase in muscle strength is an initial neuromuscular response, which is followed later on by muscle hypertrophy (24). In a recent meta-analysis of 21 studies, Wilson et al found that the effect size for muscle hypertrophy was larger with strength training alone than with concurrent strength and endurance training (25).
The general pain relief following exercise therapy (and subsequent decrease in disability) could be due to the gate control mechanism (peripheral synaptic decrease in pain fiber activity due to motor neuron activity) or the central release of endorphins. Both explanations are related to the amount of exercise (26, 27). Indeed, the amount of exercise (3 times a week, at least 12 supervised sessions) seems to be important for pain relief and reduction in disability. However, greater exercise intensity did not improve the effect of exercise therapy, as seen in direct comparisons in RCTs between high and low intensity exercise in aerobic and resistance exercise (28, 29), indicating the need for other theories to explain the effect of exercise therapy in reducing pain.
We found no support for individualization of exercise programs based on patient characteristics. For example, the effect of exercise therapy in reducing pain was not associated with the severity of knee OA, although there was large heterogeneity. These results support the findings of Wallis et al (30), who reported a moderate effect of exercise therapy in reducing pain in patients with knee OA awaiting total knee replacement, with an effect size that was comparable to the effect found for knee OA in general (1, 2, 30). Optimal exercise program characteristics were not identified. Except for the number of supervised sessions in aerobic exercise, exercise program characteristics did not explain differences in effect size in reducing pain and disability. Although only trials using the ACR classification criteria for OA were eligible, patients in the trials were heterogeneous. Stratified analysis based on radiographic or clinically diagnosed OA was not performed, since this was not reported in most trials. Because the ACR classification criteria were required, trials including patients with chronic knee pain were excluded from the analysis, even though these patients may have had chronic knee pain due to knee OA.
This study has some limitations. First, when the included trials were stratified and analyzed separately for the aerobic, resistance, performance, and mixed exercise groups, we expected heterogeneity to decrease. It was only reduced in the performance group, but remained very large in the other groups. Analyzing these exercise groups by using covariates in the meta-regression analyses did not reduce heterogeneity, except for the number of supervised exercise sessions in the aerobic exercise group. Due to the relatively small number of trials in the different exercise groups, the meta-regression analysis had to be performed first for exercise characteristics and second for patient characteristics and disease-specific covariates. Although a large number of covariates were analyzed, we could not explain most of the inconsistency in effect between trials. The fact that the heterogeneity remained large even after stratifying by exercise type reflects the large differences in the exercise program characteristics. Large variations were also seen in patient characteristics, both sociodemographic and disease specific. Furthermore, some of the trials could not be included in the meta-regression analysis of the different exercise characteristics since the intervention was not reported in detail. In order to investigate the impact of different exercise program characteristics, it is important that future exercise interventions are described in detail with regard to intensity, length of program, total number of supervised sessions, duration of individual supervised sessions, and number of sessions per week.
Second, the classification of trials according to type of exercise was performed by the main author only, which increased the risk of misclassification into exercise groups. However, our classification of exercise was similar to that reported in the review by Fransen and McConnell (2).
Third, combining SMD across trials could cause bias if trials with a homogeneous patient group (small SD) were combined with trials that had heterogeneous patient groups (large SD). Smaller trials are more likely to include homogeneous patient groups, and a small study bias is then present. A risk of small study bias was seen by Egger's test, but its significance was mainly due to the trial by Gür et al (31), which only had a very small weight in the meta-analysis (0.59%). In general, the methodological quality of the trials was moderate to low, but the sensitivity analysis of risk of bias showed that trials with a high risk of bias did not systematically overestimate the effect of exercise therapy.
Finally, 10 trials were excluded due to language. However, the 8 available abstracts of the excluded trials indicated that those trials showed similar effects of exercise therapy in reducing pain and disability compared to the included trials. Potential confounders such as differences in medication use and differences in co-interventions between the treatment groups were not seen in the included trials. Since the impact of these limitations is relatively small, the overall conclusion remains unchanged.
Optimal exercise programs for knee OA should have one aim and focus on improving aerobic capacity, quadriceps muscle strength, or lower extremity performance. In patients with poor aerobic capacity and muscle strength, aerobic exercise and strength training should be performed on different days in order to achieve the best effect. For best results, the program should be supervised, carried out 3 times weekly, and comprise at least 12 sessions. Such programs have similar effects regardless of patient characteristics, including radiographic severity of OA. These results should be confirmed in individual patient data meta-analyses investigating the dose-response relationship in knee OA.
AUTHOR CONTRIBUTIONS
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Juhl had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Juhl, Christensen, Roos, Zhang, Lund.
Acquisition of data. Juhl, Zhang, Lund.
Analysis and interpretation of data. Juhl, Christensen, Roos, Zhang, Lund.