Lexington Concordance

Anyway, you get my point. Openness. Transparency. You will hold me accountable; you will hold Congress accountable. - then-candidate Barack Obama in Parma, Ohio, March 4, 2008

Anyone making a career in politics winds up repeating themselves in short order. In or out of office, they advance the same perspectives, press the same or similar issues, focus on or ignore particular topics, urge the same sort of action on the same sort of situation. They have pet causes and personal interests. Less charitably, they're prone, as a class, to cliches and catchphrases. But any way you look at it, they tend to put the same words together with some frequency.

In Parma and in other speeches, Barack Obama promised, in the same or similar words, to televise drug price negotiations with pharmaceutical companies for all to see. He didn't, of course, instead giving them everything they could have wanted for a one-time concession. These many years later, there isn't much to do about that broken promise. But the frequency with which Obama the candidate talked about putting negotiations on C-SPAN, and when and how quickly he stopped as President, is more than just a bitter memory: it's data.

Repetitions and trends in political discourse can show individual speakers' constantly developing priorities, reactions, intra- and inter-party connections, and more -- another dimension along which we can evaluate the goals, principles, and effectiveness of those who seek to govern us. Where do they lead political discourse, and when do they follow at a comfortable distance? What causes do they keep fighting for, and what gets quietly forgotten or hauled out only for special occasions? Whose interests do they spend their time advancing? When someone tells us "look at my record!", how can we do that quickly and quantitatively?

Transcripts of public-directed speech like interviews, debates, and addresses at sub-national levels come from too many different sources to assemble easily into a corpus that will yield to analysis. However, a more limited collection is possible: the United States Congressional Record transcribes utterances of senators and representatives from all fifty states (plus the token representation enjoyed by the millions of citizens of our five essentially powerless territories and Washington, DC itself).

The Lexington Concordance catalogues the speech of incumbent members of Congress, as well as the still-nationally-relevant Joe Biden and Kamala Harris, starting from the year 2000. It indexes repeated phrases, or grams, by frequency of use. Caveat lector:

  • An automated concordance can only show where, when, and how many times things are said. All sentiment is obliterated. For example, "medicare, all, ..." grams will appear for anyone discussing Medicare for All, whether in positive or negative terms. Clicking on a gram will show links to the contexts in which it appeared.
  • At present, only grams the speaker has used at least twice are shown, so the Concordance will not tell you whether an individual member of Congress has ever said something (for that, search the Congressional Record directly). This is due to temporal rather than technical constraints, and grams automatically become visible once repeated.
  • Transcripts are necessarily incomplete and imperfect. The Congressional Record is a thorough transcription, but of course it doesn't cover statements made outside of session, and the process of filtering out procedural irrelevancies and the like involves fallible heuristics.
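
For the curious, the first two caveats can be illustrated with a minimal sketch in Python. This is not the Concordance's actual code: the stopword list, the two-word gram size, and the function names grams and repeated_grams are all assumptions made for illustration. It counts grams across one speaker's statements and keeps only those used at least twice.

```python
import re
from collections import Counter

# Hypothetical stopword list, assumed for illustration only.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}

def grams(text, n=2):
    """Return n-word tuples from text after lowercasing and dropping stopwords."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def repeated_grams(speeches, n=2, min_count=2):
    """Count grams across one speaker's speeches; keep those used at least min_count times."""
    counts = Counter()
    for speech in speeches:
        counts.update(grams(speech, n))
    return {g: c for g, c in counts.items() if c >= min_count}

# Two made-up statements with opposite sentiment.
speeches = [
    "We need Medicare for All.",
    "I oppose Medicare for All in its current form.",
]
print(repeated_grams(speeches))
```

In this toy input the only gram that survives is ("medicare", "all"), counted twice, even though one statement supports the policy and the other opposes it. Every gram that appears only once is hidden until it is repeated.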

Obligatory Disclaimer

The Lexington Concordance is the after-hours brainchild of a lone programmer dabbling in natural language processing, and has not been approved or funded by anyone or any political party or concern.