ORIGINAL RESEARCH
An appraisal of global clinical practice guidelines in thromboprophylaxis for superficial endovenous treatment
Whittley S,1 Salih M,1,2 Bowie J,1,2 Jubouri M,1,3 Onida S,1,2 Carradice D,4,5 Davies AH1,2
Plain English Summary
Why we undertook the work: Doctors are unsure about how to prevent blood clots in patients having treatment for varicose veins as there are differences in the clinical guidelines that doctors use. Conflicting guidelines can make it hard for doctors to decide on the best treatments, which can lead to worse patient outcomes and higher costs for healthcare. We wanted to examine the quality of these guidelines to identify their strengths and weaknesses.
What we did: We searched medical databases and other resources to find guidelines about preventing blood clots in varicose vein treatments. Four reviewers assessed these guidelines using a tool called AGREE II, which looks at the quality of guidelines in six areas, such as clarity, involvement of key contributors and how easy they are to apply in practice. We also checked how consistent the reviewers were in their evaluations and how different quality aspects relate to each other.
What we found: We found 10 guidelines published between 2014 and 2024 that met our criteria. Four of these guidelines were rated as high quality while six were low quality. There was a lot of variation in what these guidelines recommended for preventing blood clots. The scores showed that the guidelines were particularly weak in practical applicability. Our analysis showed that the reviewers agreed well on their ratings. We also found strong links between how clear the guidelines were, how involved stakeholders were and their overall quality.
What this means: The guidelines we looked at for preventing blood clots in varicose vein treatments have many inconsistencies and are based on low-quality evidence, making them less useful for doctors. By improving the quality and practical applicability of these guidelines, we can make them clearer and more effective. Future research should explore how the quality of these guidelines affects patient outcomes and gather feedback from doctors about how guideline inconsistencies influence their treatment choices.
Abstract
Introduction: Variability in clinical practice for pharmacological thromboprophylaxis in superficial endovenous interventions may reflect inconsistencies and ambiguities present in clinical practice guidelines (CPGs) for this patient cohort. Conflicting recommendations not only complicate clinical decision-making but can also negatively impact patient outcomes and impose unnecessary costs on healthcare providers. This study aimed to assess the quality of these guidelines using the Appraisal of Guidelines for REsearch & Evaluation II (AGREE II) instrument, highlighting strengths, weaknesses and areas for improvement.
Methods: A systematic search of Ovid Medline, Embase and grey literature was conducted to identify CPGs addressing pharmacological thromboprophylaxis in superficial endovenous interventions. Four independent assessors evaluated each guideline using the AGREE II tool across six domains: Scope and Purpose, Stakeholder Involvement, Rigour of Development, Clarity of Presentation, Applicability and Editorial Independence. Inter-reviewer reliability was calculated using the intraclass correlation coefficient (ICC) and a Pearson correlation analysis assessed associations among the domains.
Results: Ten guidelines published between 2014 and 2024 met the eligibility criteria. Among these, four (40%) were classified as high quality, specifically those from the National Institute for Health and Care Excellence (NICE), European Society for Vascular Surgery (ESVS), Scottish Intercollegiate Guidelines Network (SIGN) and the joint American Venous Forum (AVF), American Vein and Lymphatic Society (AVLS) and Society for Vascular Surgery (SVS) guideline. The remaining six guidelines were rated as low quality, with the Royal Society of Medicine (RSM) guideline scoring the lowest. Notable variability was observed in the scores, particularly within the Rigour of Development and Applicability domains, with the Applicability domain achieving the lowest mean score (33.4±26.0%). ICC values indicated good inter-reviewer reliability (ICC=0.81), with excellent agreement observed in the Stakeholder Involvement and Rigour of Development domains. Strong correlations between the Scope and Purpose, Stakeholder Involvement and Rigour of Development domains suggest that these aspects of guideline quality are interrelated.
Conclusions: The assessed guidelines for pharmacological thromboprophylaxis in superficial endovenous interventions exhibit considerable inconsistencies and a reliance on low-quality evidence, which limits their applicability in clinical practice. Targeted improvements in the Rigour of Development and Applicability domains could enhance the clarity, quality and practical utility of these guidelines. Future research could focus on evaluating the impact of guideline quality on clinical outcomes and explore clinicians’ perspectives on guideline inconsistencies to better understand their influence on decision-making in this area.
Introduction
Clinical practice guidelines (CPGs) are systematically developed recommendations that aim to assist clinicians and patients in making informed decisions for specific clinical situations by evaluating the benefits and risks of various treatment options based on comprehensive evidence.1–3 To standardise clinical practices and ensure effective and consistent patient care, CPGs must be of high quality and regularly updated. Developing reliable and applicable recommendations requires rigorous methodologies and well-defined development strategies.4–7 However, the process behind guideline development can vary significantly, resulting in considerable differences in guideline quality, with some failing to meet basic standards.8–11 Lower-quality guidelines risk contributing to inconsistencies in clinical practice and potentially leading to suboptimal patient outcomes.9–12 Additionally, conflicts of interest in CPG development, including instances of pharmaceutical industry funding, raise concerns about the impartiality of recommendations,13 with financial conflicts sometimes inadequately disclosed and guidelines occasionally published without thorough peer review. Such issues can undermine the credibility and integrity of CPGs.
In the context of pharmacological thromboprophylaxis for superficial endovenous interventions, several CPGs have been published by key bodies, including the National Institute for Health and Care Excellence (NICE), the Scottish Intercollegiate Guidelines Network (SIGN) and the European Society for Vascular Surgery (ESVS).14–16 Despite the availability of these guidelines, considerable variability in clinical practice persists globally,17,18 reflecting potential contradictions and ambiguities within the recommendations and creating challenges for clinicians making treatment decisions. Furthermore, high-quality evidence to guide patient selection, drug choice (eg, low-molecular-weight heparin or direct oral anticoagulants), dosing and treatment duration in superficial endovenous interventions remains limited. Although pharmacological thromboprophylaxis may reduce the incidence of deep vein thrombosis (DVT) in this patient population,19 its practical utility requires further examination – particularly considering the potential cost savings and reduction of adverse effects if it is found to be unnecessary.20–22
Previous studies have used the Appraisal of Guidelines for REsearch & Evaluation II (AGREE II) tool, a validated instrument for evaluating the methodological quality and reporting standards of CPGs.8,23,24 These assessments have highlighted persistent weaknesses in key areas, including stakeholder involvement and clinical applicability,8 emphasising the value of systematic appraisal approaches. AGREE II provides a standardised quantitative method for assessing guidelines and identifying areas where transparency or methodological rigour may be lacking, potentially limiting the clinical utility of CPGs.25–29 This study therefore aims to critically appraise CPGs for pharmacological thromboprophylaxis in superficial endovenous interventions using the AGREE II tool.
Methods
Search strategy and CPG identification
To identify relevant CPGs, a systematic search strategy was developed using the keywords: (Guideline*) AND (Varicose veins or superficial venous incompetence or venous insufficiency or chronic venous disease*). The search was conducted on Ovid Medline and Embase databases on 8 April 2024. The results were exported to Covidence software for screening.30
Two independent reviewers (SW and MS) conducted the title and abstract screening using pre-defined eligibility criteria (Table 1). These criteria aimed to capture not only guidelines meeting the Institute of Medicine’s definition,3 but also those widely used by clinicians, even if they fell outside this strict definition. Articles that met the initial screening requirements underwent full-text review by the same two reviewers to confirm their eligibility, with reasons for exclusion documented. Any conflicts between reviewers were resolved through discussion. Eligible guidelines were subsequently extracted for appraisal. The methods for CPG identification were reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.31 To ensure comprehensive identification of relevant CPGs, a grey literature search was also conducted on official websites of relevant organisations and societies, including NICE and the Royal Society of Medicine (RSM).32,33 These guidelines were screened separately using the same eligibility criteria.
Critical appraisal of eligible CPGs
The methodological quality of eligible CPGs was assessed using the AGREE II tool, a validated 23-item instrument organised into six domains that each evaluate different aspects of guideline quality.25 The AGREE II is widely recognised and has been approved by NICE, with previous applications in appraisals of guidelines related to vascular surgery and venous disease.23,24,32,34 The six domains are as follows: Domain 1 (Scope and Purpose) evaluates the overall aim of the guideline, the specific health questions addressed and the target population; Domain 2 (Stakeholder Involvement) assesses whether the guideline development involved appropriate stakeholders and represented the views of its intended users; Domain 3 (Rigour of Development) focuses on the methods used to gather and synthesise evidence, formulate recommendations and plan for updates; Domain 4 (Clarity of Presentation) reviews the language, structure and format of the guideline; Domain 5 (Applicability) considers the potential barriers and facilitators to implementation, strategies to improve uptake and resource implications of applying the guideline; and Domain 6 (Editorial Independence) ensures that the recommendation is not unduly biased by competing interests.7,25,27,28 An ‘overall assessment’ section is also included to rate the overall quality of each guideline and determine whether the reviewer would recommend it for use in clinical practice (Table 2).
Four reviewers (SW, MS, JB and MJ) were provided with a User Manual detailing how to assess and rate each item using the AGREE II instrument.7 Each reviewer independently assessed each guideline and rated each item on a 7-point scale from 1 (strongly disagree) to 7 (strongly agree). The lead reviewer (SW) served as the primary contact for any reviewer queries.
Scores were entered into an Excel template provided by the lead reviewer, and statistical analysis was performed using R Statistical Software. Overall domain scores were calculated following standard AGREE II methodology.7
The minimum possible domain score was calculated as follows:
(number of items in the domain) x (‘strongly disagree’ score [=1])
x (number of reviewers [=4])
While the maximum possible domain score was calculated by:
(number of items in the domain) x (‘strongly agree’ score [=7])
x (number of reviewers [=4])
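These minimum and maximum possible domain scores are presented in Table 3. To generate scaled domain scores, the following formula was used:
(obtained score – minimum possible score) / (maximum possible score – minimum possible score) x 100%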
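As an illustration of the scaling step, the following is a minimal Python sketch of the standard AGREE II calculation, in which the obtained score is rescaled between the minimum and maximum possible domain scores. The study itself used R; the function name `scaled_domain_score` and the example ratings are hypothetical and are not taken from the study data.

```python
def scaled_domain_score(ratings):
    """Return the scaled domain score (%) from per-reviewer item ratings.

    `ratings` holds one list per reviewer, each with one 1-7 rating per item.
    """
    n_reviewers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every reviewer must rate every item")
    if any(not 1 <= x <= 7 for r in ratings for x in r):
        raise ValueError("ratings must lie on the 1-7 scale")

    obtained = sum(sum(r) for r in ratings)
    minimum = n_items * 1 * n_reviewers   # all items rated 'strongly disagree'
    maximum = n_items * 7 * n_reviewers   # all items rated 'strongly agree'
    return 100.0 * (obtained - minimum) / (maximum - minimum)


# Hypothetical ratings: four reviewers, three items (as in a 3-item domain).
example = [[5, 6, 6], [6, 6, 7], [2, 4, 3], [3, 3, 2]]
example_score = scaled_domain_score(example)  # (53 - 12) / (84 - 12) x 100
```

With four reviewers and a three-item domain, the minimum and maximum possible scores are 12 and 84, so an obtained total of 53 scales to roughly 57%.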
Since AGREE II does not provide specific thresholds to differentiate guideline quality, cut-off values from similar appraisals were used to classify guidelines as high or low quality.23,35 Guidelines were classified as high quality if they met one of the following criteria: >50% in all six domains; >60% in five domains; or >60% in Domain 3 and two other domains. Guidelines that did not meet any of these cut-offs were classified as low quality (Table 4).
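The classification rule can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: the third criterion is assumed to mean >60% in Domain 3 plus at least two other domains, which is consistent with how the joint AVF/AVLS/SVS guideline is described in the Results. The function name `classify_guideline` is hypothetical.

```python
def classify_guideline(scores):
    """Classify a guideline as 'high' or 'low' quality.

    `scores` lists the six scaled domain scores (%) in domain order 1-6.
    """
    if len(scores) != 6:
        raise ValueError("expected six domain scores")

    all_above_50 = all(s > 50 for s in scores)
    n_above_60 = sum(s > 60 for s in scores)
    # Assumed reading of the third criterion: Domain 3 (index 2) is >60%
    # and at least two further domains are also >60%.
    domain3_rule = scores[2] > 60 and n_above_60 >= 3

    if all_above_50 or n_above_60 >= 5 or domain3_rule:
        return "high"
    return "low"
```

For example, a hypothetical guideline scoring 70, 65, 62, 75, 30 and 40 across Domains 1 to 6 would be classified as high quality under the third criterion, despite two domains falling below 50%.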
Inter-reviewer reliability
Inter-reviewer reliability was assessed by calculating intraclass correlation coefficients (ICCs) using R. A two-way random-effects model was used, given that the same four assessors rated all 10 guidelines. Absolute agreement was measured to evaluate consistency of ratings across reviewers. ICC interpretation was as follows: <0.5 indicated poor reliability, between 0.5 and 0.75 indicated moderate reliability, between 0.75 and 0.9 indicated good reliability and >0.9 indicated excellent reliability.36
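For readers unfamiliar with the calculation, a two-way random-effects, absolute-agreement ICC can be sketched in Python as below. The text does not state whether single or average measures were reported, so the single-measure form, often written ICC(A,1) or ICC(2,1), is assumed here. The function names and the demonstration table are hypothetical, and the handling of values falling exactly on a band boundary is also an assumption.

```python
def icc_absolute_agreement(table):
    """ICC(A,1): two-way random effects, absolute agreement, single rater.

    `table` has one row per rated subject and one column per rater.
    """
    n = len(table)       # number of rated subjects (e.g. guidelines)
    k = len(table[0])    # number of raters (e.g. reviewers)
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Map an ICC to the reliability bands quoted in the text."""
    if icc < 0.5:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Illustrative ratings only: six subjects scored by four raters.
demo = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
demo_icc = icc_absolute_agreement(demo)
```

Because absolute agreement penalises systematic differences between raters, the between-rater mean square appears in the denominator; a consistency-type ICC would omit that term.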
Correlation analysis
To evaluate relationships between domain scores, Pearson correlation coefficients were calculated using scaled scores for each domain across the CPGs. A Pearson correlation coefficient ranges from –1 to +1, indicating the strength and direction of association between two variables, where +1 denotes a perfect positive relationship and –1 denotes a perfect negative one. Correlations were assessed at a significance level of p<0.05.
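A minimal Python sketch of the Pearson coefficient is given below for illustration. It computes only r, not the associated p-value, and the function name `pearson_r` and the example values are hypothetical rather than drawn from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scaled scores for two domains across five guidelines.
domain_a = [1, 2, 3, 4, 5]
domain_b = [2, 1, 4, 3, 5]
r_example = pearson_r(domain_a, domain_b)
```

In the study setting, each input sequence would hold the scaled scores of one domain across the 10 CPGs, giving one coefficient per pair of domains.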
Results
Eligible CPGs
The systematic search performed on Ovid Medline and Embase identified 1287 articles, of which 330 were duplicates. An additional three articles were identified through the grey literature search. A total of 957 titles and abstracts were screened, of which 920 were excluded based on the general eligibility criteria. Forty articles underwent full-text review, of which 30 were excluded for the following reasons: not being a guideline or formal advice (n=1), lacking recommendations on pharmacological thromboprophylaxis (n=13), not available in the English language (n=2), having been superseded (n=10) or full-text being unavailable (n=4). This resulted in 10 CPGs being included in the final appraisal (Figure 1). Of these, nine met the AGREE II definition of a CPG.1,7 The RSM guideline,37 while not strictly a formal guideline, was included due to its widespread use in UK clinical practice and relevance to the study objective. For the purpose of this manuscript, it will be referred to as a CPG.
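The screening counts above can be checked with a few lines of Python. This sketch assumes that the three grey literature articles joined the process at the full-text stage, which is the reading under which the reported totals reconcile; the variable names are illustrative.

```python
identified = 1287          # records from Ovid Medline and Embase
duplicates = 330
screened = identified - duplicates

excluded_at_screening = 920
grey_literature = 3        # assumed to enter at the full-text stage
full_text = screened - excluded_at_screening + grey_literature

# Reasons for exclusion at full-text review, as reported in the text.
full_text_exclusions = {
    "not a guideline or formal advice": 1,
    "no recommendations on pharmacological thromboprophylaxis": 13,
    "not available in English": 2,
    "superseded": 10,
    "full text unavailable": 4,
}
included = full_text - sum(full_text_exclusions.values())
```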
The 10 CPGs included were published between 2014 and 2024 and originated from regions including North America, Europe, the UK, Scotland, France and international organisations (Table 5). The CPGs represented a diverse range of institutions including government bodies (eg, NICE and SIGN),14,15 local and international scientific organisations (eg, ESVS and SVS) and medical societies (eg, RSM).16,37,38 Notably, only one guideline focused exclusively on venous thromboembolism (VTE) prophylaxis in varicose vein procedures,37 while others covered a broader scope. This included two guidelines on VTE prophylaxis,14,15 one on the management of varicose veins,39 one on the classification and treatment of endothermal heat-induced thrombosis,40 three on thermal ablation,41–43 one on sclerotherapy and one on the management of chronic venous disease of the lower limbs.16,44
Among the 10 CPGs, five recommended pharmacological thromboprophylaxis specifically for patients at high risk of VTE,15,39,41,42,44 three recommended an individualised approach,16,37,40 one recommended against routine administration of pharmacological thromboprophylaxis and one advised considering it only if anaesthesia time exceeded 90 minutes and the VTE risk outweighed the bleeding risk.14,43
Inter-reviewer reliability
The overall inter-reviewer reliability, measured by the ICC, was 0.81 (95% CI 0.534 to 0.944), indicating good agreement among the four assessors. ICC values for each domain across all guidelines are presented in Table 6. All domains had ICC values >0.5, indicating at least moderate reliability. Domains 2 (Stakeholder Involvement) and 3 (Rigour of Development) exhibited the highest levels of agreement, with ICCs of 0.941 and 0.943, respectively, indicating excellent reliability. Domains 5 (Applicability) and 6 (Editorial Independence) showed good reliability with ICCs of 0.825 and 0.824, respectively. Domains 1 (Scope and Purpose) and 4 (Clarity of Presentation) demonstrated moderate agreement, with ICCs of 0.664 and 0.552, respectively.
CPG methodological quality appraisal
The individual reviewer scores and scaled domain scores for each CPG are presented in Table 7. Since the ‘overall assessment’ score is a separate summary score reflecting the assessors’ overall judgment of the CPG rather than being a formal domain, it was excluded from the analysis and the scores are instead presented in Appendix 1 (see www.jvsgbi.com).
The mean scaled scores for each CPG were used to determine their methodological quality. Based on the quality cut-offs presented in Table 4, four guidelines (40%) – including those from NICE,14 ESVS,16 SIGN and the joint AVF/AVLS/SVS – were classified as high quality.15,38 The ESVS guideline achieved the highest mean scaled score (85.2±17.9%) and scored above 50% in all six domains.16 NICE was the second-highest scoring guideline14 with a mean scaled score of 83.2±16.6%, also scoring above 50% in all six domains. Notably, NICE and ESVS were the only CPGs to score >50% in all six domains.14,16 SIGN was the third highest ranking CPG15 with a mean scaled score of 80.3±25.2%, scoring above 60% in domains 1–5. The fourth highest scoring CPG was the joint AVF/AVLS/SVS guideline,38 which had a mean scaled score of 65.8±21.0%, scoring above 60% in domain 3 as well as domains 1, 2 and 4.
In contrast, six CPGs (60%) were classified as low quality, with the RSM guideline scoring the lowest (28.0±29.4%).37 The second lowest scoring CPG was the joint phlebological society guideline,44 which scored 43.0±28.6%. The remaining low-quality CPGs – including those from ECoP,41 FSVM,42 UIP and the joint AVF/SVS guidelines – had mean scores ranging from 45.2±30.2% to 60.0±24.4%.40,43
CPG performance in individual domains
Considerable heterogeneity was observed across assessor scores of CPGs in Domains 2, 3, 5 and 6, reflected in the large interquartile ranges (IQRs) seen in the boxplot presented in Figure 2. Notably, Domain 3 (Rigour of Development) had the widest IQR, indicating the highest variability in assessor scores in this domain. Domains 1 (Scope and Purpose) and 4 (Clarity of Presentation) had the narrowest IQRs (21.5 and 18.25, respectively), suggesting the highest level of agreement and lowest heterogeneity across assessor scores in these domains.
Domain 1 (Scope and Purpose) achieved the highest mean scaled score (86.9±12.3%), with all CPGs scoring highly (from 68% to 100%). Notably, NICE,14 ESVS and SIGN each achieved a perfect score of 100%.15,16 Even the lowest scoring guideline (RSM)37 still demonstrated high quality with a score of 68%. The ICC for Domain 1 was 0.664 (95% CI 0.113 to 0.907, p<0.05), indicating a moderate level of agreement between assessors, consistent with the narrow IQR of 21.5.
Domain 2 (Stakeholder Involvement) had a lower mean scaled score of 55.4±32.6%. This domain showed greater variability, with scaled domain scores ranging from 4% to 99%. The ICC for this domain was 0.941 (95% CI 0.826 to 0.984, p<0.05), indicating an excellent level of agreement among assessors. Notably, NICE,14 ESVS and SIGN exceeded 90%,15,16 while the RSM guideline scored the lowest at 4%.37 This domain had the largest IQR of 54.25, reflecting significant heterogeneity in how assessors rated the stakeholder involvement in the CPGs.
Domain 3 (Rigour of Development) had a wide range of scores (from 5% to 93%) and a mean scaled score of 49.3±31.1%. High quality ratings were achieved by four (40%) CPGs (NICE,14 ESVS,16 AVF/AVLS/SVS and SIGN).15,38 The ESVS guideline performed the best in this domain,16 with a mean scaled score of 93%, while four (40%) of the CPGs (ECoP,41 RSM,37 the joint European Phlebological Societies and the FSVM guideline) were classified as low quality.42,44 This domain had a high ICC of 0.943 (95% CI 0.848 to 0.984, p<0.05), indicating excellent inter-reviewer agreement.
Domain 4 (Clarity of Presentation) had a mean scaled score of 73.1±12.8%, making it the second highest scoring domain. Scaled domain scores ranged from 53% to 92%, with nine CPGs (90%) receiving high scores. The ECoP guideline was the only CPG to score moderately,41 with a scaled score of 53%. The ICC for this domain was 0.552 (95% CI –0.022 to 0.863, p<0.05), indicating moderate agreement between assessors, and the narrow IQR (18.25) indicates relatively consistent ratings across the assessors.
Domain 5 (Applicability) had the lowest mean scaled score (33.4±26.0%) and included the lowest individual score (4% by ECoP).41 Scaled scores for this domain ranged from 4% to 78%. Only two CPGs (20%) – NICE and SIGN – scored highly,14,15 while two others (20%) – ESVS and the joint AVF/AVLS/SVS guideline – were of moderate quality.16,38 The remaining six guidelines (60%) were classified as low quality in this domain. The ICC for Domain 5 was 0.825 (95% CI 0.553 to 0.951, p<0.05), indicating good agreement between assessors.
In Domain 6 (Editorial Independence), the mean scaled score was 51.5±27.1%. Three guidelines (30%) – ESVS, the joint AVF/SVS and ECoP guidelines – were considered high quality in this domain,16,40,41 while two guidelines – RSM and the joint phlebological societies guidelines – were rated as low quality.37,44 The ICC for Domain 6 was 0.824 (95% CI 0.532 to 0.951, p<0.05), indicating a good level of agreement between assessors despite a broad score range (13–88%).
Correlation analysis
Pearson correlation coefficients between the scaled scores for each domain are presented in Table 8. Strong positive correlations were observed between Domain 1 and Domain 2 (r=0.92, p<0.05), Domain 1 and Domain 3 (r=0.90, p<0.05), Domain 2 and Domain 3 (r=0.96, p<0.05) and Domain 4 and Domain 5 (r=0.95, p<0.05). These findings suggest that high performance in one of these domains is associated with similarly high performance in the others. Conversely, Domain 6 showed weak or negative correlations with most other domains, with the exception of a non-significant positive correlation with Domain 3 (r=0.44) and a non-significant negative correlation with Domain 4 (r=–0.17).
Discussion
The CPGs developed by major organisations demonstrated higher quality compared with those from smaller or less specialised institutions. The Scope and Purpose domain achieved the highest score, reflecting a consistent emphasis across all CPGs on establishing a clear foundation for recommendations. In contrast, the Applicability domain scored the lowest, highlighting a significant gap in providing practical guidance for implementing recommendations in clinical practice. The limited focus on applicability – such as considerations of facilitators, barriers and resource implications – may hinder the practical adoption of these guidelines, particularly in healthcare settings with varying resource availability and protocols.45–47 It is important, however, to consider whether the performance of individual domains significantly impacts the overall usability of CPGs. While high scores in Scope and Purpose indicate well-defined guideline objectives, this does not necessarily translate to improved clinical implementation. Future research could explore whether high scoring domains correlate with guideline adherence in practice.
Our findings highlight substantial variability in the quality of CPGs. While some guidelines, particularly those from NICE, ESVS, AVF, AVLS, SVS and SIGN,14–16,38 exhibit strong methodological rigour and consistency, they also acknowledge limitations due to reliance on low-quality evidence and a lack of randomised controlled trial data. This raises the issue of whether guidelines based on poor evidence can still be clinically valuable. While these guidelines offer structured, transparent decision-making frameworks, their recommendations may be largely opinion-based, reducing their clinical utility. In contrast, poorly developed guidelines based on the same weak evidence are less valuable, lacking rigorous evidence synthesis. Both types face similar challenges in supporting clinical decisions due to the absence of robust evidence. In such cases, guidelines may need to refrain from making recommendations when the evidence is insufficient to support a clear clinical direction. Relying on expert opinion or low-level evidence, though often necessary, risks blurring the line between evidence-based guidance and clinical advice. Therefore, clearly distinguishing between evidence-supported recommendations and those based on consensus is essential for ensuring transparency regarding their limitations.
Given these concerns, the AGREE II tool could be refined to assess whether the strength of evidence justifies a recommendation. While it is effective in evaluating guideline quality, it does not address the appropriateness of issuing recommendations based on weak or limited evidence. Incorporating criteria to evaluate whether evidence sufficiently supports a recommendation could improve the tool’s utility in clinical guideline development. Ultimately, when recommendations rely primarily on expert opinion or best guesses, they function more as advisory statements than true evidence-based guidelines. This is particularly relevant for pharmacological thromboprophylaxis in superficial endovenous interventions, where most recommendations are weak, emphasising the need for high-quality research to inform future guidelines. This gap in evidence is one that the ongoing THRIVE (THRomboprophylaxis in Individuals undergoing superficial endoVEnous intervention) trial aims to address.48,49
The inconsistency in recommendations across guidelines further complicates clinical decision-making. While some CPGs recommend thromboprophylaxis only for high-risk patients,15,39,41,42,44 others advocate for an individualised approach or advise against routine use,16,37,40,43 and one recommends considering it only when anaesthesia time exceeds 90 minutes and the VTE risk outweighs the bleeding risk.14 Notably, the ESVS guidance on thrombosis does not provide specific recommendations on post-procedural thromboprophylaxis,16 instead advocating for ‘individualised thromboprophylaxis’, highlighting the need for stronger evidence in this area. This variability in recommendations reflects weaknesses in guideline development and limits their applicability, making it difficult for clinicians to implement consistent evidence-based thromboprophylaxis strategies across diverse patient populations and healthcare settings. The resulting ambiguity fosters uncertainty, complicating clinical decision-making for patients undergoing superficial endovenous interventions.
The majority of guidelines advise offering pharmacological thromboprophylaxis to high-risk patients; however, the criteria for defining high risk in this patient cohort are unclear.50 Guidelines recommending individualised approaches similarly lack specific scenarios for application, and while the ESVS and joint AVF/AVLS/SVS guidelines suggest routine risk stratification,16,38 they do not specify tools or criteria for identifying ‘high-risk’ status. In practice, many clinicians use the Department of Health and Caprini risk assessment tools,51,52 although no validated tool exists for this population. These inconsistencies reflect a lack of consensus, creating challenges for clinicians applying these guidelines in real-world settings.
Using the AGREE II instrument with four independent assessors strengthened the reliability of this evaluation. However, the relatively small number of included CPGs and the lack of guidelines specifically focused on superficial endovenous interventions may limit the generalisability of these findings. Although AGREE II is a valuable tool for assessing guideline quality, it lacks specific thresholds to distinguish high- from low-quality guidelines, leaving the overall assessment rating largely to the assessors’ subjective judgement. Establishing clear thresholds within AGREE II to differentiate guideline quality could improve the consistency of assessments and provide assessors with clearer guidance in their evaluations.25
Consensus statements were excluded from this study as they are not official guidelines.3 However, despite not meeting rigorous criteria for systematic guideline development, the RSM guideline was included given its widespread use and clinical relevance in the UK.37 While lacking a strong methodological foundation, it was developed by a reputable medical society and offers practical recommendations aligned with the focus of this study. Its inclusion allows for a more comprehensive assessment of available guidance on pharmacological thromboprophylaxis for this patient cohort. Despite its practical utility, the RSM guideline had the lowest methodological quality score of all 10 CPGs, reflecting limited stakeholder involvement, weak development processes and a lack of transparency. This highlights both the existing gaps in high-quality guidance for pharmacological thromboprophylaxis in superficial endovenous procedures and the reliance on lower-quality sources in routine clinical decision-making.
Strong correlations among Scope and Purpose, Stakeholder Involvement and Rigour of Development domains suggest these aspects of quality are closely related. This could indicate that a well-defined scope and purpose promote rigorous development and comprehensive stakeholder involvement. Guidelines performing well in one domain tend to perform well in others, indicating that these elements may reinforce each other. Editorial Independence, however, showed weak or negative correlations with most domains, suggesting that it represents a distinct quality aspect not directly related to other domains. This may indicate inconsistent addressing of editorial independence across guidelines, irrespective of overall rigour or clarity. Previous research has highlighted that the Rigour of Development domain is a significant predictor of overall guideline quality,53,54 and focusing on this domain could enhance CPG quality.35,55 An extension of the AGREE II tool, specifically tailored to surgical guidelines, has been proposed to address limitations in surgical guideline development and provide a more suitable framework for high-quality guideline development.56,57
While clear reporting in CPGs is crucial for transparency,35 strong reporting alone does not ensure robust methodological quality.58 A guideline may be well reported but lack methodological rigour,59 a distinction seen in systematic reviews where separate tools assess methodological quality and reporting transparency.31,60,61 Applying a similar approach to CPGs would support guidelines that are both clearly reported and methodologically robust. Collaboration between AGREE II and GRADE (Grading of Recommendations, Assessment, Development and Evaluations) has been suggested to develop unified standards that would improve guideline development and appraisal.62 Although both AGREE II and GRADE carry some subjectivity, GRADE provides a transparent framework for evaluating evidence certainty, requiring authors to justify their ratings, particularly in cases of downgrading.49 AGREE II, in contrast, does not require assessors to document specific reasons for their domain or overall assessments.25 Establishing predefined criteria for AGREE II item judgements may help raters reach consensus, especially when discrepancies arise.35
This review was limited to guidelines available in English. While this approach ensures consistency in evaluation and reduces potential translation biases, it excludes guidelines from non-English-speaking regions such as China, India and Japan, as well as those not available in English that were excluded during the full-text review.63,64 Research on the impact of including non-English articles in analyses has yielded mixed results.65–67 Consequently, our findings may have limited global applicability, particularly in regions with different healthcare systems and clinical practices. To address this limitation, future studies could incorporate translated versions of non-English guidelines or involve multilingual reviewers to broaden the scope and comprehensiveness of guideline appraisals.
Conclusions
Overall, the guidelines for pharmacological thromboprophylaxis in superficial endovenous interventions are often inconsistent, ambiguous and largely supported by low-quality evidence. Key domains, particularly Rigour of Development and Applicability, would benefit from targeted improvements to enhance the clarity and practical utility of these guidelines. High-quality clear guidance is essential to support effective clinical decision-making and ultimately improve patient outcomes. Future research may include evaluating how guideline quality affects patient outcomes or conducting qualitative studies with clinicians to further explore how inconsistencies in guidelines impact clinical decisions.
Article DOI:
Journal Reference:
J.Vasc.Soc.G.B.Irel. 2025; Online ahead of publication
Publication date:
April 23, 2025
Author Affiliations:
1. Academic Section of Vascular Surgery, Department of Surgery and Cancer, Imperial College London, Charing Cross Hospital, London, UK
2. Imperial Vascular Unit, Imperial College Healthcare NHS Trust, St Mary’s Hospital, London, UK
3. Hull York Medical School, Hull, UK
4. Academic Vascular Surgery Unit, Hull York Medical School, Hull, UK
5. Department of Vascular Surgery, Hull University Teaching Hospitals NHS Trust, Hull, UK
Corresponding author:
Alun Huw Davies, Professor of Vascular Surgery, Department of Surgery and Cancer, Imperial College, London W6 8RF, UK
Email: [email protected]