Analyzing hundreds of thousands of letters, emails, and phone calls between legislators and federal agencies.
This repository contains code to merge, augment, and analyze data on congressional correspondence with the federal bureaucracy.
ID and as well as a LetterID that is unique to each letter or phone call.
members/nameCongress.R #9 and committee membership data are augmented from Charles Stewart III and Jonathan Woon, Congressional Committee Assignments, 103rd to 114th Congresses, 1993–2017, http://web.mit.edu/17.251/www/data_page.html in committees/committees.R #12

Tasks recently completed:
- MemberNameDateCorrections.R script in the members folder #10

Here are some tasks that anyone can do:
All datasheets must have these columns:
- FROM is the column with the name(s) of the Member(s) of Congress who signed the letter. If names are in multiple columns, a new FROM column is created in the script that cleans those data.
- DATE is the date of the letter (or the best approximation).
- SUBJECT is a summary of the letter’s content. If more than one column contains substantive information, these are added to SUBJECT in the script that cleans those data.

Most datasheets have additional columns, such as the letter’s text, priority level, date of reply, or the person in the agency tasked with responding to the letter. Because such information is not consistent across agencies, these columns are dropped when sheets are merged. They can be added back in for a more detailed analysis of specific departments or agencies. For example, see the more detailed analysis of FERC.
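The column conventions above can be sketched in base R as follows. This is only an illustration, not the repository's cleaning code: the raw sheet, its column names (Signer1, Signer2, Received, Topic, Summary, Priority), and the helper combine_columns() are all hypothetical.

```r
# Hypothetical raw sheet: signer names are split across two columns and
# substantive content is spread over "Topic" and "Summary".
raw <- data.frame(
  Signer1  = c("Rep. A", "Sen. B"),
  Signer2  = c("Rep. C", NA),
  Received = c("2015-03-01", "2016-07-15"),
  Topic    = c("Pipeline permit", "Grant status"),
  Summary  = c("Requests update", NA),
  Priority = c("High", "Low"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: paste the non-missing, non-empty values from
# several character columns into one string per row.
combine_columns <- function(df, cols, sep = "; ") {
  unname(apply(df[, cols, drop = FALSE], 1, function(row) {
    paste(row[!is.na(row) & row != ""], collapse = sep)
  }))
}

# Keep only the three required columns; agency-specific columns such as
# Priority are dropped so that sheets from different agencies can be stacked.
clean <- data.frame(
  FROM    = combine_columns(raw, c("Signer1", "Signer2")),
  DATE    = as.Date(raw$Received),
  SUBJECT = combine_columns(raw, c("Topic", "Summary")),
  stringsAsFactors = FALSE
)

print(clean)
```

Once every sheet has the same three columns, sheets from different agencies can be stacked with rbind(); any agency-specific columns would need to be joined back in separately for a more detailed analysis.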
Other columns required for applying the codebook are added by the function in prep sheets.R.
merge.R

If extractMemberName() fails to match:
- pattern variable
- members data can be added in nameCongress.R or noted in #9
- MemberNameTypos.R
- extractMemberName() fails to find it, note this in #62

Where there is insufficient information to identify a letter’s date or author, the NOTES column should include “FOIA” and commits tagging observations to FOIA should reference #76
Data that are ready for coding should have an open issue named “apply codebook to AGENCY”.
Where there is insufficient information to code a letter, the NOTES column should include “FOIA” and #76 should be tagged in the “apply codebook” issue.
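As a rough sketch of the NOTES convention, the base R snippet below adds “FOIA” to NOTES for rows with no usable author or date. The data frame letters_df and its contents are made up for illustration, and the rule used here (missing FROM or DATE) is an assumption about what counts as insufficient information; the repository's scripts may apply a different test.

```r
# Hypothetical data: row 2 has no author, row 3 has no date.
letters_df <- data.frame(
  FROM    = c("Rep. A", NA, "Sen. B"),
  DATE    = as.Date(c("2015-03-01", "2015-04-02", NA)),
  SUBJECT = c("Pipeline permit", "Grant status", "Casework"),
  NOTES   = c(NA, "illegible signature", NA),
  stringsAsFactors = FALSE
)

# Assumed rule: a row needs a FOIA follow-up if the author or date is missing.
needs_foia <- is.na(letters_df$FROM) | letters_df$FROM == "" |
  is.na(letters_df$DATE)

# Do not add the tag twice if NOTES already mentions FOIA.
already_tagged <- !is.na(letters_df$NOTES) &
  grepl("FOIA", letters_df$NOTES, fixed = TRUE)
to_tag <- needs_foia & !already_tagged

# Append "FOIA" to existing notes, or set it when NOTES is empty.
letters_df$NOTES[to_tag] <- ifelse(
  is.na(letters_df$NOTES[to_tag]),
  "FOIA",
  paste(letters_df$NOTES[to_tag], "FOIA", sep = "; ")
)

print(letters_df[, c("FROM", "DATE", "NOTES")])
```

Appending rather than overwriting keeps any earlier notes on the row intact, and the check for an existing tag makes the step safe to rerun.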

