In recent years, deep learning methods based on the pretraining/finetuning paradigm have become state-of-the-art on a number of language tasks. The success of pretrained neural language models raises the following question: to what extent can good general linguistic representations be learned from language modeling alone? One line of research that aims to test this treats pretrained neural language models as linguistic experiment subjects, using the probabilities these models assign to the sentences in minimal pairs as a proxy for human acceptability judgments. Taking this approach, I will present tests of GPT-2 on data from one particular cataphora study, and will also discuss ongoing work in this vein.
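
To make the minimal-pair methodology concrete, the following is a minimal sketch of how such a comparison might be run, assuming the HuggingFace transformers library and the public GPT-2 checkpoint; the sentence pair shown is a hypothetical illustration, not an item from the cataphora study.

```python
# Minimal-pair acceptability sketch: score each sentence in a pair by its
# total log-probability under GPT-2 and check which one the model prefers.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Sum of the token log-probabilities GPT-2 assigns to the sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Logits at position i predict the token at position i + 1.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    return log_probs.gather(1, targets.unsqueeze(1)).sum().item()

# Hypothetical minimal pair (first member intended as the acceptable one).
acceptable = "Before she left, Mary locked the door."
unacceptable = "Before herself left, Mary locked the door."

# The model "prefers" the acceptable sentence if it assigns it higher probability.
print(sentence_log_prob(acceptable) > sentence_log_prob(unacceptable))
```

In practice, scores are often length-normalized or aggregated over many items per condition, but the core comparison is the same: the model is credited with capturing a contrast when it assigns higher probability to the acceptable member of the pair.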