Colloquium Fall 2018

Faculty coordinator: Dr. Jolynn Pek
Venue: 35 Psychology Building
Time: 12:30-1:30pm

August 27, 2018
Introduction and Welcome
*pizza and soda will be served

September 3, 2018
Labor Day

September 10, 2018
Speaker: Danny Avello Fernández
Faculty of Education, Universidad Católica de Chile
Title: Latent Variables in Psychometric Models
Abstract: Latent variables are core concepts in psychometrics. Tests, and models of test data, are meant to yield information about latent variables; measuring latent variables is the reason for collecting test data. Latent variables are interpreted as the traits and cognitive abilities that explain the scores observed in tests. High-impact decisions are made based on estimates of latent variables. Thus it is a valid question to ask whether they are unobserved or unobservable. How much can we know about latent variables from the observed scores? For instance, can we identify the marginal distribution of a latent variable from the observed scores?

To address questions of this kind, we use the geometry of Hilbert spaces. We work in the Hilbert space of random variables with finite variance, which we call $H$. If $x$ and $w$ are two random variables in $H$, then $E(x^2)$ is the squared length of $x$ (its squared norm); thus, assuming $E(x) = 0$, the variance can be interpreted as the squared length of $x$. We say that $x$ is orthogonal to $w$ if $E(wx) = 0$. It also follows that $E(x \mid w)$ is the orthogonal projection of $x$ onto the span generated by $w$. This means that $E(x \mid w)$ is the element in the span of $w$ that is closest to $x$.

We start from the general orthogonal decomposition
$y_{ij} = E(y_{ij} \mid \theta_i) + (y_{ij} - E(y_{ij} \mid \theta_i))$    (1)
where $y_{ij}$ is the observed score of person $i$ responding to item $j$ (or test $j$) and $\theta_i$ is the latent variable value of person $i$. We found that under the weak version of the axiom of local independence (conditional orthogonality instead of conditional independence), $E(y_{ij} \mid \theta_i) = \theta_i$. Thus, if we denote by $\varepsilon$ the term $(y_{ij} - E(y_{ij} \mid \theta_i))$ in decomposition 1, then for two measures of the same latent variable, Equation 1 can be rewritten as
$y_{i1} = \theta_i + \varepsilon_1$    (2)
$y_{i2} = \theta_i + \varepsilon_2$    (3)

Kotlarski (1967) shows that it is possible to identify the marginal distributions of $\theta$, $\varepsilon_1$, and $\varepsilon_2$ from Equations 2 and 3. We use the estimation procedure developed by Bonhomme et al. (2008). We ran simulations with gamma, normal, and Bernoulli distributions. For all these distributions we were able to recover the density of the latent variable in Equation 1.
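
As a rough illustration of the two-measure model in Equations 2 and 3, the Python sketch below simulates data and checks one simple second-moment consequence: when the latent variable and the two errors are mutually independent, the covariance between the two measures equals the variance of the latent variable. This is not the estimation procedure referred to in the abstract, and it does not recover full densities. The gamma and normal parameters, the sample size, and the function name simulate_two_measures are assumptions made only for this example.

```python
import numpy as np


def simulate_two_measures(n, rng):
    """Draw n persons from y1 = theta + eps1, y2 = theta + eps2.

    Assumed for illustration: theta ~ Gamma(shape=2, scale=1), and
    independent zero-mean normal errors with standard deviation 0.5.
    """
    theta = rng.gamma(shape=2.0, scale=1.0, size=n)  # latent variable
    eps1 = rng.normal(loc=0.0, scale=0.5, size=n)    # error of measure 1
    eps2 = rng.normal(loc=0.0, scale=0.5, size=n)    # error of measure 2
    return theta, theta + eps1, theta + eps2


rng = np.random.default_rng(0)
theta, y1, y2 = simulate_two_measures(200_000, rng)

# With independent theta, eps1, eps2: Cov(y1, y2) = Var(theta).
cov_y = np.cov(y1, y2)[0, 1]
var_theta = theta.var(ddof=1)
print(f"Cov(y1, y2) = {cov_y:.3f}, Var(theta) = {var_theta:.3f}")
```

In real data only y1 and y2 would be available, so the sample covariance of the two measures would serve as the estimate of the latent variance.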

We also explain that a latent variable as in Equations 2 and 3 can be both unobserved and unobservable. Because latent variables represent the substantive concepts behind the data, being unobserved means that these concepts have the same ontological status as the data obtained from the test. We cannot observe the concepts but they “live” in the data.

 

September 17, 2018
Speaker: Dr. Yan Zhang
Department of Biomedical Informatics, The Ohio State University
Note location change: PS 219
Title: Genome structural variations and their functional impact
Abstract: Genome structural variations (SVs) not only generate genetic diversity but also can cause diseases. High-throughput sequencing technologies have been used to identify SVs from various genomes. Retroduplications are a special type of SVs. A fate of novel retroduplications is the generation of fusion genes when the retroduplication inserts into a coding gene. For example, the UQCRH gene is interrupted in some soft-tissue sarcoma and contains processed pseudogenes in its genomic structure. UQCRH is a novel prognostic factor for hepatocellular carcinoma. In this talk, I will first introduce the basic types of SVs and our studies on their functional impact in general populations. Then I will talk about the tools we developed for retroduplication identification and our studies on inferring the functional impact of novel retroduplications in disease context.

September 24, 2018
Speaker: Dr. Alex Petrov
Department of Psychology, The Ohio State University
Title: An epistemological justification of Ockham’s razor
Abstract: Simple models are preferable to complex ones, all else being equal. What is the justification of this principle? Many prominent thinkers throughout history offered metaphysical and/or theological justifications. For example, Newton wrote in his _Principia_ (1713): “Nature does nothing in vain, and more causes are in vain when fewer suffice. For nature is simple and does not indulge in the luxury of superfluous causes.”  Elsewhere Newton also wrote: “It is the perfection of God’s works that they are all done with the greatest simplicity. He is the God of order & not of confusion.”

In this talk, I am going to survey various theoretical results from statistics (e.g., bias-variance tradeoff) and statistical learning theory (e.g., structural risk minimization), as well as empirical results from machine learning. On this basis, I’m going to argue that Newton was wrong. Ockham’s principle of parsimony does not reveal that the world is simple, but merely that our epistemic access to the world is limited.

October 1, 2018
Speakers:
Dr. Longjuan Liang, Senior Psychometrician at Educational Testing Service
Dr. Nuo Xi, Senior Psychometrician at Educational Testing Service
Dr. Jim McGinley, Director of Behavioral Analytics at Vector Psychometric Group
Dr. David Kriska, Personnel Psychologist (retired), City of Columbus
Ty Henkaline, Managing Director of Data and Insights, Singularity University
Title: Panel Discussion: Careers in Industry for Quantitative Methodologists
Abstract: Quantitative psychology is a broad field encompassing mathematical modeling, research design and methodology, and statistical analysis of psychological data. This scientific field traces its roots to the study of human (mental) abilities and psychological measurement, giving birth to psychometric models (e.g., factor analysis and item response theory). The interdisciplinary nature of quantitative psychology is evident in its broad impact in social science research, including the disciplines of education, public health, and data analytics. This panel discussion of experts in these fields, who have graduated from quantitative psychology programs, will focus on describing their experiences working in industry.

October 8, 2018
Speaker: Dr. David Melamed
Department of Sociology, The Ohio State University
Title: The Structure of Human Social Networks Promotes Prosocial Behaviors
Abstract: Understanding the evolution of prosocial behaviors – cooperation and unilateral flows of resources – remains an important interdisciplinary scientific problem. On the one hand, prosocial behaviors are an evolutionary paradox since they entail decreasing one’s own fitness to benefit another. On the other, understanding the mechanisms that promote prosocial behaviors enables their prediction, making this a key scientific problem as well. Some of the key mechanisms that have been identified to promote prosocial behavior are inherently network or relational phenomena. Direct reciprocity arises in dyads, and both generalized and indirect reciprocity arise in broader network structures. However, these processes are typically studied in isolation, and the role of broader network topology is not understood. Here, we use an agent-based model to investigate how social networks shape prosocial acts. The behavior of the agents is driven by results from a web-based experiment, and the networks that define their relations are derived from generative models of the ten largest friendship networks in the National Longitudinal Study of Adolescent to Adult Health. We find that properties of real-world networks are indeed related to rates of prosociality. In particular, homophily and transitivity increase types of reciprocity, which, in turn, increase prosociality. That is, we demonstrate that the formal properties of human social relations promote prosocial behavior.

October 15, 2018
Speaker: Dr. Robert Gore
Department of Psychology, The Ohio State University
Title: A Beta Adjustment to the Starting Point of the Linear Ballistic Accumulator Model Accounts for Automatic Facilitation
Abstract: A repetition effect for response time (RT) in a two-alternative forced choice task has been observed for over sixty years. The RT drops with repetition of the stimulus as long as the interstimulus interval is relatively short. No contemporary mathematical model of response times efficiently accounts for this effect. Most models assume that responses are independent and identically distributed (i.i.d.). By implementing a beta-distributed starting point in Brown and Heathcote’s Linear Ballistic Accumulator (LBA) model, the repetition effect can be explained by an embedded beta-binomial model of across-trial evidence accumulation. However, other adjustments are required for identification and tractability. I will present the derivation of a closed-form solution to the joint distribution of choice and response time and preliminary simulation results. The beta-adjusted LBA accounts well for the historical pattern of findings when stimuli are sometimes repeated and participants respond under speeded conditions.
Discussant: Nicholas Rockwood

October 22, 2018
Speaker: Dr. Jason Hsu
Department of Statistics, The Ohio State University
Title: Errors in Multiple Testing Big and Small, Now and Then, More or Less
Abstract: All things are connected, old and new, here and there, this and that.

In the late 1980s, multiple comparisons moved from controlling the Experimentwise Type I error rate (under the scenario that all nulls are true) to controlling Tukey’s Familywise Type I error rate. I will explain the difference and the original reason for this switch, so that this important lesson is not forgotten in solving today’s problem of targeting immunotherapy cancer patients.
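
As a small illustration of why error rates are controlled across a whole family of tests (a sketch under simplifying assumptions, not a reconstruction of the speaker's argument), the Python snippet below computes the probability of at least one Type I error when every null hypothesis is true and the tests are independent, each run at level alpha. The independence assumption, the values of alpha and m, and the function name prob_any_false_rejection are choices made only for this example.

```python
def prob_any_false_rejection(alpha, m):
    """P(at least one Type I error) for m independent tests of true nulls,
    each carried out at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


alpha = 0.05
for m in (1, 10, 100):
    unadjusted = prob_any_false_rejection(alpha, m)
    # Bonferroni: run each test at alpha / m so the overall rate stays <= alpha.
    bonferroni = prob_any_false_rejection(alpha / m, m)
    print(f"m={m:3d}  unadjusted={unadjusted:.3f}  Bonferroni={bonferroni:.3f}")
```

With ten tests the unadjusted rate is already about 0.40, while the Bonferroni-adjusted rate stays just under 0.05.
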
In the mid 1990s, bias from marginal means multiple comparisons (Means in Proc GLM) in linear models was corrected by switching to least squares means multiple comparisons (LSmeans in Proc GLM and Proc Mixed). Around 2010, LSmeans multiple comparisons were extended to binary and time-to-event outcomes, in SAS and in R. This extension turned out to not have been entirely thought through, giving incorrect results for odds ratios and hazard ratios. Personalized/precision medicine targeting subgroups of patients revealed this issue recently. Fortunately, multiple comparisons for relative response and ratio of medians, which are subgroup mixable, have just been developed.

Since the early 2000s, genome-wide association studies (GWAS) have become popular. A typical academic GWAS might assess which of a million single-nucleotide polymorphisms (SNPs) are associated with a phenotype (such as early onset Alzheimer’s Disease). Turns out, even though biologically the vast majority of SNPs would not be associated, statistically all SNPs will pick up some effect if there is just one causal SNP. Tukey was right: all null statistical hypotheses are false (so Type I error rate control is not interpretable). This startling fact was revealed by pharmaceutical GWAS which, in contrast to academic GWAS, involve medicine. Fortunately, GWAS formulated as a simultaneous confidence intervals problem for assessing “clinically meaningful effect” has no such issue, as I will indicate.

We are all connected, by time, place, and people; I thus dedicate this presentation to Michael Browne.

October 29, 2018
Speaker: Jack DiTrapani
Department of Psychology, The Ohio State University
Title: Modeling Extreme Response Styles in Behavioral Genetics Using IRTrees
Authors: Jack DiTrapani, Nicholas J. Rockwood, Minjeong Jeon
Abstract: Behavioral genetics studies often employ the ACE model to decompose the variance of a latent variable into variances due to genetic and environmental influences. The manifest variables for these types of analyses are often categorical in nature (e.g. item responses). In such circumstances, an item response theory (IRT) model can be incorporated into the ACE model as the measurement model. Standard IRT models do not typically account for different response style tendencies, such as extreme response style (ERS), which is a given respondent’s propensity to endorse item categories at a scale’s endpoints regardless of the underlying trait being measured. Work utilizing item response trees (IRTrees; De Boeck & Partchev, 2012) to model ERS tendencies has shown that ignoring ERS can alter conclusions reached on the latent trait under investigation (Bockenholt, 2017). However, ERS has yet to be explored using IRTrees in the context of ACE models, which is the focus of the present research. Specifically, we use an IRTree to model the latent trait of interest and a latent ERS trait. The variances of these latent variables are decomposed into genetic and environmental factors. The utility of this research is twofold. First, we obtain estimates of the proportions of variance in ERS and the substantive latent trait due to genetic and environmental influences. Second, we compare the genetic and environmental influences in the trait of interest when the IRTree model is used, relative to the model that does not control for ERS.  
Discussant: Jacob Couts

November 5, 2018
Speaker: Dr. Brooke Magnus
Department of Psychology, Marquette University
Authors: Brooke Magnus and Yang Liu
Title: A Zero-Inflated Box-Cox Normal Unipolar Item Response Model for Measuring Constructs of Psychopathology
Abstract: This research introduces a latent class item response theory (IRT) approach for modeling item response data from zero-inflated, positively skewed, and arguably unipolar constructs of psychopathology. As motivating data, we use 4,925 responses to the Patient Health Questionnaire (PHQ-9), a nine-item Likert-type depression screener that inquires about a variety of depressive symptoms. First, Lucke’s log-logistic unipolar item response model is extended to accommodate polytomous responses. Then, a nontrivial proportion of individuals who do not endorse any of the symptoms are accounted for by including a nonpathological class that represents those who may be absent on or at some floor level of the latent variable that is being measured by the PHQ-9. To enhance flexibility, a Box-Cox normal distribution is used to empirically determine a transformation parameter that can help characterize the degree of skewness in the latent variable density. A model comparison approach is used to test the necessity of the features of the proposed model. Results suggest that (a) the Box-Cox normal transformation provides empirical support for using a log-normal population density, and (b) model fit substantially improves when a nonpathological latent class is included. The parameter estimates from the latent class IRT model are used to interpret the psychometric properties of the PHQ-9, and a method of computing IRT scale scores that reflect unipolar constructs is described, focusing on how these scores may be used in clinical contexts.
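
For reference, the textbook form of the Box-Cox transformation with parameter $\lambda$ is given below; the exact parameterization used in the talk is not stated in the abstract, so this is only the standard definition.

```latex
g_\lambda(x) =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
\log x, & \lambda = 0,
\end{cases}
\qquad x > 0 .
```

Because $(x^{\lambda} - 1)/\lambda \to \log x$ as $\lambda \to 0$, an estimated transformation parameter near zero corresponds to a log-normal density, which is consistent with result (a) in the abstract.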

November 12, 2018
Veterans Day

November 19, 2018
Speaker: Dr. Lijun Cheng
Department of Biomedical Informatics, The Ohio State University
Title: Intelligence Precision Medicine- biomarkers, targets, and drugs
Abstract:
Background. Integrating data from cell lines and tumors makes it possible to link cellular genomic features with patients, where the ultimate goal is to build predictive signatures of patient outcome. Characterizing key genetic alterations in cancer cells and discovering therapeutic targets for patients is a major goal of precision cancer medicine. With the development of multi-omics data, it has become urgent to monitor consensus module patterns of multi-omics data under various biological conditions in both tumors and cell lines, especially those related to the control of druggable targets. This demands new computational biclustering solutions that can handle the growing volume of data at different ‘omics’ levels in order to find these molecular functional modules.
Method. A fast biclustering algorithm, Bi-EB, is developed to detect local patterns in integrated multi-omics data in both cancer cells and tumors. Bi-EB adopts a data-driven statistical strategy, using the Expectation-Maximization (EM) algorithm to extract the foreground bicluster pattern from background noise in an iterative search. Recovery and relevance scores are used to evaluate the accuracy of Bi-EB by comparing its results with those of seven popular biclustering methods on simulated constant, row shift-scaled, and column shift-scaled bicluster data. Based on mRNA and protein expression profiles, co-regulated local gene patterns in subgroups of breast cancer cells and tumors are detected systematically for the first time.
Result. Simulation results show that Bi-EB achieves higher recovery and relevance than all seven biclustering methods on both row and column shift-scaled biclusters, and in particular outperforms the CC, spectral, and xMotif algorithms in searching for constant biclusters. Bi-EB is applied to the luminal-A and basal-like subtypes of breast cancer to search for co-regulation modules of druggable target mRNA/protein across patients and cancer cells.
Conclusion. A transparent probabilistic interpretation and ratio strategy for omics data is proposed for the first time to detect co-regulation patterns. Bi-EB is applied to identify co-expressed mRNA/protein patterns in luminal-A breast cancer; the agreement of the results with clinical practice further supports the reliability of the Bi-EB algorithm.

November 26, 2018
Speaker: Selena Wang
Department of Psychology, The Ohio State University
Title: Latent space models for heterogeneous multi-mode networks
Abstract: Social relationships influence individual outcomes such as dementia (Fratiglioni et al., 2000), decision making (Kim & Srivastava, 2007), adolescent smoking (Mercken et al., 2010), and online behavior choices (Hazen Kwon et al., 2014). In many studies, both a friendship network among the subjects and a network of item responses are available. Therefore, it can be useful to jointly analyze the information given by both network matrices. For this purpose, we introduce latent space models for heterogeneous multi-mode networks (LSHMM) that combine latent space models (Hoff et al., 2002) with multidimensional two-parameter item response models (Reckase, 2009). This model merges information given by multiple networks with multiple types of nodes. We develop a variational Bayesian Expectation-Maximization algorithm to perform posterior inference in the model. Simulations are performed to demonstrate the efficacy of the methods.
Discussant:

December 3, 2018
Brief presentations on external talks
*pizza and soda will be served
Nicholas Rockwood
Selena Wang
Diana Zhu
Saemi Park
Bob Gore
Yiyang Chen
Ivory Li
