House

The Congressional Memory Project

Legend to Obscure Icons
Search	Select	Clear	Stop	Pause	Play

The Congressional Memory was developed by the Internet Multicasting Service in 1995 as the keystone pavilion for the Reinventing Government area. We secured congressional press credentials, ran dedicated lines into the U.S. Capitol, and moved audio from the floor of the House and Senate back into our computer systems. At the same time, we secured access to the text of The Congressional Record.
Deb K. Roy of the MIT Media Lab did the bulk of the sophisticated processing that occurred once the data reached our studios. He wrote routines that automatically segmented the Congressional Record and identified inserted remarks, names of speakers, and other clues. He then wrote audio segmenting routines that broke the audio up by speaker, and speaker recognition routines that matched a segment against the database of known speakers.
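To make the text-segmenting step concrete, here is a minimal sketch in Java. It is not the original routines, which are not reproduced on this page. It assumes, as a simplification, that each speaker turn in the Record begins with a title, an upper-case surname, and a period (for example "Mr. ALPHA."); the class names `Turn` and `RecordSegmenter` and the pattern itself are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// One speaker turn: who spoke and the text attributed to them (hypothetical type).
final class Turn {
    final String speaker;
    final StringBuilder text = new StringBuilder();

    Turn(String speaker) {
        this.speaker = speaker;
    }
}

final class RecordSegmenter {
    // Assumed convention: a turn opens with "Mr.", "Mrs." or "Ms.",
    // then an upper-case surname and a period.
    private static final Pattern SPEAKER_LINE =
            Pattern.compile("^(?:Mr|Mrs|Ms)\\. ([A-Z]{2,})\\.\\s*(.*)$");

    // Splits the record into turns; lines that do not open a new turn
    // are appended to the current one, and blank lines are skipped.
    static List<Turn> segment(String record) {
        List<Turn> turns = new ArrayList<>();
        Turn current = null;
        for (String line : record.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            Matcher m = SPEAKER_LINE.matcher(trimmed);
            if (m.matches()) {
                current = new Turn(m.group(1));
                current.text.append(m.group(2));
                turns.add(current);
            } else if (current != null) {
                current.text.append(' ').append(trimmed);
            }
        }
        return turns;
    }
}
```

A real segmenter would also need to handle inserted remarks, speakers identified by state, and procedural text, none of which this sketch attempts.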
The result for the user was a database of text and audio. Users could enter keyword searches, such as "show me all Democrats from Idaho who spoke about the budget last week." The transcripts of any relevant speeches would then be pulled up, and the user could review each transcript and hear the attached audio. Because of the propensity of the U.S. Congress to edit verbatim transcripts, the transcript and the audio did not always match.
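A query like the one above amounts to filtering speech records by party, state, date range, and a keyword in the transcript. The following Java sketch shows that idea only; it is not the WAIS-based search used in the original system. The `Speech` fields, the `SpeechSearch.search` method, and the sample entries are all invented for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical record of one speech; field names are illustrative only.
final class Speech {
    final String speaker;
    final String party;      // e.g. "D" or "R"
    final String state;      // two-letter postal code
    final LocalDate date;
    final String transcript;
    final String audioFile;  // name of the attached audio clip

    Speech(String speaker, String party, String state,
           LocalDate date, String transcript, String audioFile) {
        this.speaker = speaker;
        this.party = party;
        this.state = state;
        this.date = date;
        this.transcript = transcript;
        this.audioFile = audioFile;
    }
}

final class SpeechSearch {
    // Returns speeches by members of the given party and state, dated within
    // [from, to] inclusive, whose transcript contains the keyword (case-insensitive).
    static List<Speech> search(List<Speech> speeches, String party, String state,
                               String keyword, LocalDate from, LocalDate to) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Speech> hits = new ArrayList<>();
        for (Speech s : speeches) {
            boolean inRange = !s.date.isBefore(from) && !s.date.isAfter(to);
            if (inRange && s.party.equals(party) && s.state.equals(state)
                    && s.transcript.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(s);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Placeholder data, not taken from the actual Record.
        List<Speech> speeches = new ArrayList<>();
        speeches.add(new Speech("Speaker A", "D", "ID", LocalDate.of(1995, 6, 12),
                "I rise to discuss the budget resolution.", "speech-a.au"));
        speeches.add(new Speech("Speaker B", "R", "ID", LocalDate.of(1995, 6, 13),
                "The budget must be balanced.", "speech-b.au"));

        List<Speech> hits = search(speeches, "D", "ID", "budget",
                LocalDate.of(1995, 6, 10), LocalDate.of(1995, 6, 16));
        for (Speech s : hits) {
            System.out.println(s.speaker + " (" + s.date + "): " + s.audioFile);
        }
    }
}
```

A linear scan is enough for a small demonstration; several hundred hours of speeches would call for an index of the kind the original system relied on.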
The demonstration you see here was produced by William R. Wallis, an undergraduate researcher at the MIT Media Lab. Only 15 minutes of audio is present here (as opposed to the several hundred hours that were available during the world's fair), and we have removed the keyword search of the Record. The original implementation was built with Perl, C, WAIS databases, and several other subsystems. The current system is implemented solely in the Java language.
The Java source.