Examining past disclosures in FracFocus

As I have mentioned in this blog, FracFocus does not maintain an audit trail as companies change already-published disclosures.  In fact, such changes are made without public notice or justification.

One way to see what those changes are would be to compare older archived data with more recent data.  While FracFocus only supplies bulk data for the current data state, we have been saving these bulk downloads since Fall 2018 and currently have around 250 separate downloads.  We are working on a resource that researchers can use to delve into that older data.

Number of disclosures in set of archived downloads

The two obvious research questions for these older data are: 1) what silent changes are happening in the data, and 2) how long is the publication delay between the end of a fracking job and its appearance in FracFocus?  The latter question is interesting because many states have publication requirements of 30 or 60 days, and yet there are many cases where the delay is closer to a few years.
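To make the second question concrete, here is a minimal sketch of how a publication delay could be estimated from archived downloads. It assumes we already know, for each disclosure, the job end date and the date of the first archived download in which it appears. The function names and the sample records are hypothetical, not part of FracFocus or of our tools. Because a disclosure could have been published any time between two downloads, the result is an upper bound on the true delay.

```python
from datetime import date

def publication_delay_days(job_end: date, first_seen: date) -> int:
    """Days from the end of the job to the first archive containing the disclosure."""
    return (first_seen - job_end).days

def late_disclosures(records, limit_days=30):
    """Return (disclosure_id, delay) pairs whose delay exceeds limit_days.

    `records` is an iterable of (disclosure_id, job_end, first_seen) tuples.
    """
    late = []
    for disclosure_id, job_end, first_seen in records:
        delay = publication_delay_days(job_end, first_seen)
        if delay > limit_days:
            late.append((disclosure_id, delay))
    return late

# Made-up records for illustration only.
sample = [
    ("A", date(2019, 1, 10), date(2019, 2, 1)),  # appears within a month
    ("B", date(2017, 6, 1), date(2020, 6, 1)),   # appears three years later
]

for disclosure_id, delay in late_disclosures(sample, limit_days=30):
    print(disclosure_id, delay)
```

The `limit_days` value would be set to whatever the relevant state requires, for example 30 or 60.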

In the FracFocus bulk download, there are two types of zip files. In the first type, the entire data set is split into roughly 25 zipped CSV files; these include all the chemical records.  The second type of zip file contains only the metadata: one line for each disclosure.  To create a useful “index” to all the archived disclosures, we use this second type, which makes it much easier to compare the separate downloads with one another.  Although the entire archive is too large to store online, this index will be available.
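As a rough illustration of the idea (not the actual code behind our index), the sketch below builds a small index from metadata snapshots that have already been read into memory. It records when each disclosure first appears and notes every download in which its metadata row differs from the previous version. The column names (`DisclosureId`, `OperatorName`, `JobEndDate`), the function names, and the sample rows are assumptions made for the example.

```python
import hashlib

def row_fingerprint(row: dict) -> str:
    """Hash a metadata row so that a change in any field changes the fingerprint."""
    canonical = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def build_index(snapshots: dict, key: str = "DisclosureId") -> dict:
    """Map each disclosure to its first-seen date and its list of distinct versions.

    `snapshots` maps an ISO download date (YYYY-MM-DD) to a list of row dicts.
    """
    index = {}
    for download_date in sorted(snapshots):  # ISO dates sort chronologically
        for row in snapshots[download_date]:
            fp = row_fingerprint(row)
            entry = index.setdefault(
                row[key], {"first_seen": download_date, "versions": []}
            )
            # Record a new version only when the row differs from the last one seen.
            if not entry["versions"] or entry["versions"][-1][1] != fp:
                entry["versions"].append((download_date, fp))
    return index

def silently_changed(index: dict) -> list:
    """Disclosures whose metadata differs between at least two downloads."""
    return sorted(k for k, e in index.items() if len(e["versions"]) > 1)

# Made-up snapshots for illustration only.
snapshots = {
    "2018-10-01": [
        {"DisclosureId": "A", "OperatorName": "Acme", "JobEndDate": "2018-08-15"},
    ],
    "2019-03-01": [
        {"DisclosureId": "A", "OperatorName": "Acme Energy", "JobEndDate": "2018-08-15"},
        {"DisclosureId": "B", "OperatorName": "Bolt", "JobEndDate": "2019-01-20"},
    ],
}

index = build_index(snapshots)
print(silently_changed(index))
print(index["B"]["first_seen"])
```

Storing only a fingerprint per version keeps the index small, which matters when the full archive is too large to put online. The trade-off is that the index shows that a row changed and when, but not which field changed; answering that still requires going back to the archived files.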