Due to their state-of-the-art performance on natural language processing tasks, large neural language models have recently garnered significant interest. To better understand their linguistic abilities, researchers have used the targeted linguistic evaluation paradigm to test neural models in a more linguistically controlled manner. Following this line of work, I am interested in how neural models handle cataphora, i.e., constructions in which a pronoun precedes its referent (e.g., when [he] gets to work, [John] likes to drink a cup of coffee). I will present work that takes stimuli from existing cataphora studies, runs GPT-2 on them, and compares the model's results to the experimental data. A number of issues arise in making these comparisons, motivating a new study to collect data better suited to testing neural models. I show the setup for my pilot experiment along with some preliminary results, and I end with some ideas for future directions of this work.
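As a concrete illustration of the kind of targeted evaluation described above, the sketch below computes GPT-2's summed surprisal for a cataphoric sentence and a matched control, assuming the HuggingFace transformers library. This is a minimal sketch, not the study's actual code; the model choice and the stimulus pair are illustrative assumptions.

    # Minimal sketch: GPT-2 surprisal for a hypothetical cataphora minimal pair.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def sentence_surprisal(sentence: str) -> float:
        """Summed token surprisal (negative log probability, in nats) under GPT-2."""
        ids = tokenizer(sentence, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
        # Each position's logits predict the *next* token, so shift by one.
        log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
        target_lp = log_probs[torch.arange(ids.size(1) - 1), ids[0, 1:]]
        return -target_lp.sum().item()

    # Hypothetical minimal pair: cataphoric pronoun vs. full-NP control.
    cataphoric = "When he gets to work, John likes to drink a cup of coffee."
    control = "When Mary gets to work, John likes to drink a cup of coffee."
    print(sentence_surprisal(cataphoric), sentence_surprisal(control))

In the targeted evaluation paradigm, per-item surprisals like these would be aggregated over a controlled stimulus set and compared against human reading-time or judgment data.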