Americanization of the Supreme Court of Canada?
Why Cross-Court Word Embeddings Require Canadian Legal NER
Canadian Political Science Association Conference (2026)

Abstract
This project proposes to measure the linguistic distance between the Supreme Court of Canada (SCC), the Supreme Court of the United States (SCOTUS), and the UK apex courts (House of Lords pre-2009; UK Supreme Court thereafter) using transformer embeddings of 23,320 majority opinions from 1942 to 2022. We read that distance as a proxy for cross-court convergence or divergence. Preliminary chunk-level embeddings of the SCC and SCOTUS corpora reveal a small but consistent gap between within-court and cross-court cosine similarity. We contend that this apparent separability is an artifact of contamination rather than evidence of doctrinal distance.
This contamination, which we call “year leakage,” constrains how any distance can be read and motivates the metadata-decontamination strategy that follow-up work will implement. That strategy is a non-trivial undertaking given the current state of legal named-entity recognition (NER) for Canadian legal corpora. Reaching that point will require additional engineering and, contingent on probe diagnostics, model training.
Next steps: data curation, human annotation, Canadian NER development, and further exploration.
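For readers unfamiliar with the within-court-versus-cross-court cosine gap mentioned in the abstract, the following is a minimal sketch of one way such a gap could be computed. It is an illustration under stated assumptions, not the project's pipeline: the function names (`cosine`, `cosine_gap`) and the two-dimensional toy vectors are hypothetical, and real chunk embeddings would come from a transformer model rather than being written by hand.

```python
import math
from itertools import combinations, product


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_gap(court_a, court_b):
    """Mean within-court similarity minus mean cross-court similarity.

    Assumption: each argument is a list of chunk embeddings for one court,
    and each court contributes at least two chunks.
    """
    # All unordered pairs drawn from the same court.
    within = [
        cosine(u, v)
        for group in (court_a, court_b)
        for u, v in combinations(group, 2)
    ]
    # All pairs with one chunk from each court.
    cross = [cosine(u, v) for u, v in product(court_a, court_b)]
    return sum(within) / len(within) - sum(cross) / len(cross)


# Hypothetical toy embeddings, chosen only to make the arithmetic easy to follow.
scc_chunks = [[1.0, 0.0], [0.8, 0.6]]
scotus_chunks = [[0.0, 1.0], [0.6, 0.8]]

gap = cosine_gap(scc_chunks, scotus_chunks)
print(round(gap, 2))
```

A positive gap means chunks resemble their own court more than the other court. On its own, that number cannot distinguish doctrinal distance from leaked metadata such as years, which is the concern raised in the abstract.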
Authors
PATRON Team
Copyright © 2026 – PATRON-ROTIP
