Thomas J. Leeper, PhD ~ Behavioral Scientist and R Hacker (https://www.thomasleeper.com)

<p><strong>Peer-Improved, Not Peer-Reviewed, Journals</strong> (2018-09-16, <a href="https://www.thomasleeper.com/2018/09/peer-improved-journals">permalink</a>)</p>
<p>An interesting aspect of contemporary science is that the fast-paced development of new research, combined with alternative platforms for disseminating findings (e.g., blogs and social media), means that research frequently gets exposure, feedback, criticism, praise, journalistic attention, citations, and even replication long before it passes through a traditional peer review process. As <a href="https://thepoliticalmethodologist.com/2015/12/21/the-multiple-routes-to-credibility/">I’ve written before</a>, this means that research is effectively being peer reviewed more quickly and more broadly than ever before. Rather than peer review coming primarily through the institutionalized form of anonymous vetting by anonymous reviewers, peer review now entails public, discursive, and cumulative engagement with research. Many have criticized this transition for being too fast, too open, too informal, and too critical. Those criticisms aside, fast, public peer review seems here to stay. My question today is therefore: what does this mean for traditional journals?</p>
<p>While we can continue to pretend that academic journals remain the definitive source of the scientific record, books, preprints, working papers, conference presentations, blog posts, and even ephemeral communication forms like gists, GitHub repos, and tweets increasingly contain insights that become vital to collective knowledge. Journals were created to solve coordination problems around the sharing of scholarly insights; other platforms now provide equally viable means of research communication. Journals introduced peer review to vet unfamiliar and thus potentially untrustworthy authors; fast, public peer review provides the same check without regard to authorship and with the perk of public accountability. Fast, public peer review hasn’t replaced traditional journals and their review processes, but it presents a parallel and quite different means of checking the validity of research.</p>
<p>Journals therefore provide two functions - communication and vetting - that are rapidly being fulfilled by other means that are quicker, supported by broader and more diverse research communities, and less expensive than the for-profit publishing models that have been built up around prestigious academic journals. Amidst this competition, journals need to clarify their purpose in a radically faster, more public scientific world. They need to adjust their mission in these changing times and demonstrate value beyond communication and vetting: a new role in science, and new processes and incentive structures to support that role.</p>
<p>What I have in mind is this: the peer-reviewed journal is dead, or at least dying. If not for incentives that tie publication in academic journals to promotion, journals would have died as soon as their communicative function became replaceable. Journals discourage citations to preprints and working papers, in part, because publishers know that a world where such forms of scientific reporting count equally to articles is a world where journals are expensive legacy platforms. Similarly, top-tier general journals typically discourage sharing preprints pre-publication because they want to protect their premier position as venues for communicating research. These types of policies aim to prevent the loss of market share without any meaningful innovation.</p>
<p>An academic journal therefore needs to be something other than its minimal type: it needs to do more than communicate and vet. Let’s reason a bit by analogy. Companies often die because they lose sight of what their product is. The now-classic example is Kodak. Kodak went bankrupt because they thought they were in the business of film. But digital camera technology disrupted the film camera world. By thinking they were selling film, Kodak missed the underlying reality that they were selling a means of sharing memories. How best do you share memories? Once upon a time it was silver-plate photographs taken by a professional photographer, then it was black-and-white photographs taken on a camera, then it was color photographs taken on a camera, but then it quickly became digital photos, photo-sharing websites, and beyond. Kodak thought they sold film, but they actually sold something else, and other companies sold the public better technologies for saving and sharing memories than the product Kodak thought was its core.</p>
<p>That might seem like an oblique analogy but it gets at the core of what I want to argue: journals do not sell communication and vetting, or at least not just that anymore. Rather, journals sell curation. Other people also sell curation. I sell curation: I tell people to read things. <a href="https://twitter.com/johnholbein1">John Holbein</a> sells curation: he tweets new social science papers that people should read. Anyone can sell curation but journals have a reputation for doing it. People read journals not just because that’s where research is or that’s where research has been vetted. They read journals because there’s a fairly widely held belief that <em>good</em> research is published there; that limited time spent consuming research should be spent there rather than elsewhere. Journals sell search filters, not peer review.</p>
<p>“Okay, Thomas, that’s great but this TED Talk is getting a bit long,” you’re saying. You’re right. Let’s wrap this up by answering my initial question: what does this mean for journals? I think it means that journals need to pivot. Journals are structured entirely around the peer review process: accept arbitrary submissions, subject them to editorial and/or peer review, allow revision of some, and publish a tiny fraction of the total. But in a world where journals are recognized as being about filtering rather than reviewing, I think journals should rebrand themselves and restructure their internal operations. Here’s how:</p>
<ol>
<li>Journals should see themselves as “peer-improved” rather than “peer-reviewed”. Most research is heavily reviewed and, ultimately, individual scientists will disagree on how science should be practiced and where something is good enough. Journals should therefore focus on rapid desk rejection of papers they don’t think are improvable. Rather than rejecting because something is too new, journals should judge research on its potential. Such a frame of reference naturally invites results-blind review, preregistration, replication, and other routes to research credibility.</li>
<li>Journals should recognize and cultivate the communities of authors, reviewers, and readers they support. Peer-improved journals should first and foremost be useful. They should help improve the research performed by their contributors. They should help ensure that their community receives the research it needs to do its work. Metrics of community well-being should replace metrics of citation counts and impact factors; these traditional metrics should follow naturally from healthy journal communities. Shifting metrics will help accommodate different forms and lengths of contribution without publishers worrying about increasing the denominator by which citation counts are divided.</li>
<li>“General” journals should be given much less recognition than they currently receive. If peer-improved journals primarily curate and filter, while providing inclusive scholarly communities, then there’s almost no research that belongs in a discipline-spanning outlet. Publication in one is, if anything, a sign that the work is not especially relevant to the incremental work done by the research community surrounding an applied problem. Obviously, field-spanning broad journals have a purpose in a world of small communities, but academic incentives should be aligned to publishing the best research in the most appropriate places rather than pretending that generality is a synonym for quality.</li>
<li>The work of peer-improved journals should be to improve science. Reviewers should be invested in the idea that their work is to make research outputs better, not put them or their authors down. Reviewers should be incentivized and recognized for these positive contributions. Peer-improvement is a task of encouragement, mentoring, and feedback, not a purely critical dismissal of authors’ effort and capability.</li>
<li>Even more radically, we may need to think about journals themselves as ephemeral. Biopolitics could have benefitted from a journal in the mid-2000s where scholars worked together to improve research on genetic bases of political orientations, but with the literature moving on, such a journal feels less useful than it was a decade ago. Field experimentalists could have benefitted from a well-focused effort in the early 2000s, before the method became diffuse enough to be relevant to a larger number of communities. Polarization researchers could probably use a journal now to quickly develop questions, methods, and data, but in a decade’s time it would hopefully have outlived its usefulness. Journals should come and go as needed.</li>
</ol>
<p>These are some of the key ways that I think academic journals could usefully be disrupted by rethinking their purpose in a world where their superficial purposes are no longer unique. Ultimately, in a world of fast, public peer review, we don’t need journals to communicate science and we don’t need the current peer-review processes they provide to improve science. We can achieve both through other means, more quickly, more cheaply, more inclusively, and more effectively. What we do need are filters, and journals are well-positioned to provide those filters, provided they are also willing to match curation with cultivation, review with investment, and criticism with constructive feedback. And we need universities to recognize the value to humanity of individual efforts to improve science, not just understand science as one-off articles landed in high-prestige outlets.</p>
<p><strong>The First Mistake in Crafting Survey Experiments</strong> (2018-03-27, <a href="https://www.thomasleeper.com/2018/03/survey-experiment-first-mistake">permalink</a>)</p>
<p>In the past few years, I’ve been doing a lot of teaching on the topic of <a href="https://www.thomasleeper.com/surveyexpcourse">survey experiments</a>, motivated by the dearth of accessible guides to conducting survey-experimental research aside from <a href="https://press.princeton.edu/titles/9620.html">Mutz’s (2011) <em>Population-Based Survey Experiments</em></a>. The way I teach this course is by first outlining the logic of experimental design in general and then walking participants through how to use survey-experimental manipulations to operationalize variations in potentially causal variables. The starting point is always theory, from which manipulations and outcome questions are derived. The reason for teaching this way is that first-time experimenters frequently have a survey background (or no background at all in empirical research) and believe that the starting point of a survey experiment is the questionnaire. This is a first and often final, fatal mistake. (It is immaterial whether this approach takes place in a Word document or in survey software; the error is starting with a questionnaire at all.) Aside from the fact that starting from a questionnaire is an error-prone preparation for fieldwork that is likely to introduce trivial, easily missed typographical errors, the decision to design the survey before designing the experiment is very likely to lead a new experimenter astray.</p>
<p>There are three main reasons why starting from a questionnaire is flawed. The first is that the naive questionnaire-first approach to survey-experimental design presupposes that the first manipulations one constructs are the most suitable for any particular experimental test. For example, I may want to do an experiment on citizens’ opinions toward a specific policy and assess whether framing of that issue affects opinions, so I start by finding articles that generally use two frames that I’m interested in and then plunk those down in a Word document followed by some questions measuring opinions. In doing so, I’ve fixated on a particular style of framing manipulation and also made assumptions about what is - and what is not - held constant across experimental conditions. Why did I choose to use articles as opposed to some other stimulus? Why did I choose these articles? In what ways do they vary aside from use of the frames I was interested in? Are these representative of framing treatments generally? Why have I chosen articles at all, as opposed to framed question wordings or visual stimuli or a question-ordering manipulation? Do these stimuli have characteristics that are appropriate for testing my theory? Oh wait, I haven’t written out a theory so the last of these questions is unanswerable.</p>
<p>Good experimental design starts with a clearly stated set of empirical expectations that are expressed <em>independent of the manipulations being used</em>. If I am interested in the effects of framing on opinion, I need to state that expectation without regard to the particular device I have used to operationalize “framing” and without regard to the particular questions I have used to measure the outcome. We cannot evaluate the construct validity of the manipulation and the outcome measure if the theory does not exist independent of the measures. If I start with a questionnaire, I inevitably express my expectations in a way that is constrained by my naive intuition about what feels like a good questionnaire rather than what is a theoretically interesting hypothesis. This initial questionnaire-independent theoretical statement typically also benefits from explication of assumptions being made either in the theory or in its general operationalization. For example, perhaps I envision explicit scope conditions for my theory (such as an ambition to speak to specific types of people, specific points in time, or specific issues). Stating these up front is a form of pre-registration of my ideas (with myself) that prevents me from later getting lost in trying to explain whatever results emerge from the study in ways that are independent of its design. Starting from an explicit theoretical statement helps to avoid fishing through the data or, worse, fishing through explanations for those data.</p>
<p>The second problem with starting a survey-experimental project with a questionnaire is that it inevitably leads to a questionnaire filled with extraneous material. A survey experiment contains, at its most essential level, only two things: a manipulation of an independent, putatively causal variable and the measurement of an outcome. Most survey experiments thus contain about two items, or two versions of a single question. Yet the questionnaire-first approach to survey-experimental design often yields questionnaires with much more material. I think the reason for this stems primarily from thinking about survey experiments as if they were observational surveys. In an observational analysis of survey data, I might be particularly interested in descriptive patterns in the resulting data, such as demographic variation in outcomes. This may also be my intention in survey-experimental design, but those analyses serve a different purpose from testing my core causal hypothesis - they are frosting, or they are fodder for a different empirical project.</p>
<p>When I start from a theoretically motivated empirical expectation, my questionnaire typically generates a minimum number of items - that is, just the essential core. Anything else I add needs to be justified in terms of a specific analysis that each additional item will be used for:</p>
<ul>
<li>Why am I measuring demographics or media exposure? Is it to assess pre-treatment covariate balance? If so, why am I doing that analysis? What will I learn from it? What will I do in response to evidence of balance or imbalance?</li>
<li>Why am I measuring demographics or media exposure? Is it to assess treatment effect heterogeneity? Why? What kind of heterogeneity do I expect? If I expect heterogeneity, why didn’t I theorize it or why didn’t I manipulate its source? If I plan to search post hoc for heterogeneity, what will I do with those findings? How - if at all - will I communicate them?</li>
<li>Why am I measuring demographics or media exposure? What is it for? Is it for assessing effect heterogeneity or something else? Do I just want to know about my sample or do I want to characterize whether it is typical of my population of interest? What evidence would constitute typicality or representativeness, or the lack thereof? What will I do in response to evidence of representativeness or lack thereof?</li>
<li>Why am I measuring pretreatment opinions? Is it to improve measurement precision by conducting a within-subjects design? Will it generate consistency biases? Will it affect respondents’ understanding of the study?</li>
<li>Why am I measuring attention to or recall of the stimulus material? Is this a manipulation check? Is it something for my own benefit? Do I intend to instrument for this using random assignment in a two-stage least squares analysis? What will I do if the check “fails”?</li>
<li>Why am I measuring outcomes other than opinion? Is it that I’m worried I won’t find significant results? Is it that I have a more extensive theory than the one I mentally articulated? Is it that I plan to p-hack or perhaps set up some exploratory data analysis for future work? Is it that I might control for post-treatment outcomes or attempt some form of mediation analysis? Have I considered the assumptions necessary for analyzing data in this way?</li>
</ul>
<p>In essence, all of these questions ask the researcher to reflect upon why they are doing things in their survey instrument. This is something they would certainly do if they were designing a survey - time is precious, so it is important to only measure what will actually be used. In survey-experimental work, it is important to be even more stringent about deciding what to measure because ultimately the analysis typically relies only upon the core items of the questionnaire (the manipulation and the outcome question or questions). Other items might serve some purpose but design-based analysis of experiments does not make obvious what those purposes might be. That’s not to say there is no room for such analyses - quite the opposite - but the logic of performing such analyses does not follow from a simple analysis of a two-condition experiment in order to speak to the kind of simple theoretical statement used here.</p>
<p>Analysis aside, survey-experimental stimuli are typically designed to generate maximum variation in the putatively causal variable so that immediate effects can be measured using outcome questions soon thereafter. The greater the number of superfluous items, the more noise is introduced into the design; this is not between-condition noise that generates confounding but rather noise that makes the results more constrained or conditional. Such constraint might be desirable when we want respondents to be in a given mindset during the experience of treatment, but noise introduced for purposes other than that form of control is not necessarily useful, because it may make the results even more local than they otherwise would be (and thus inflate or diminish the apparent magnitude of results, without our ever knowing which).</p>
<p>The final problem with a questionnaire-first approach is that it ignores the connection between survey measurement and experimental data analysis. While a particular question might seem - for reasons of face validity - to be the appropriate measure of an outcome concept, decisions about how to measure outcomes need to reflect an intersection between respondent comprehensibility and data quality. Survey-experimental analysis necessarily involves the analysis of a question-operationalized variable or set of variables. If we follow recent advice within political science to mostly analyze experimental data using ordinary least squares regression, then we need to consider how the questionnaire we are designing will lead to an outcome measure that can be sensibly analyzed in this way. In particular, outcomes that might have some intuitive appeal - like rankings or qualitative categorical response questions - do not lend themselves naturally to an OLS-centric analysis (or they lend themselves to analyses that require considerable researcher discretion). Thinking ahead to the analysis, the researcher should select questionnaire items that enable the kind of analysis that is likely to be performed.</p>
<p>In the same way, the questionnaire-first approach runs the risk of introducing excessive complexity into designs as further conditions are added in the face of considerations about interesting aspects to vary or control. In the framing example used above, only two or perhaps three conditions are necessary to gain insight into the framing effect. The decision to introduce further conditions should be theoretically motivated rather than introduced by piecemeal variations on the current version of the questionnaire. If we think again about the form of analysis that might take place with the resulting data, a two-condition experiment lends itself to only one obvious OLS-based analysis but a three-condition experiment allows for substantially more variations on a theme:</p>
<ul>
<li>Outcome as a function of control as baseline, with treatment 1 and treatment 2 introduced as additional factors</li>
<li>Outcome as a function of treatment 1 as baseline, with control and treatment 2 introduced as additional factors</li>
<li>Outcome as a function of treatment 2 as baseline, with control and treatment 1 introduced as additional factors</li>
<li>Outcome as a function of control as baseline and treatment 2 as an additional factor, omitting treatment 1</li>
<li>Outcome as a function of control as baseline and treatment 1 as an additional factor, omitting treatment 2</li>
<li>Outcome as a function of treatment 1 as baseline and treatment 2 as an additional factor, omitting control</li>
<li>Outcome as a function of treatment 2 as baseline and treatment 1 as an additional factor, omitting control</li>
<li>Outcome as a function of control as baseline, merging treatment 1 and treatment 2 as an additional factor</li>
<li>Outcome as a function of merged treatment 1 and treatment 2 as baseline, with control as an additional factor</li>
<li>Outcome as a function of treatment 1 as baseline, merging control and treatment 2 as an additional factor</li>
<li>Outcome as a function of merged control and treatment 2 as baseline, with treatment 1 as an additional factor</li>
<li>Outcome as a function of treatment 2 as baseline, merging control and treatment 1 as an additional factor</li>
<li>Outcome as a function of merged control and treatment 1 as baseline, with treatment 2 as an additional factor</li>
</ul>
<p>While there is a mathematical equivalence in many of these parameterizations and all are ultimately based upon combinations and comparisons of the three treatment group means, the insights that are immediately and obviously obtained from alternative parameterizations are likely to vary dramatically. The decision about which of these parameterizations is of sole or primary interest is something that should be decided at a theoretical level in order that the design speaks to the anticipated empirical regularities in a straightforward manner. If we start with a questionnaire rather than an experimental design, that decision about what matters may not be made until later or perhaps never at all.</p>
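<p>The point that all of these parameterizations reduce to combinations of the same three group means can be made concrete with a short sketch. (Python here, with made-up outcome values purely for illustration; the function name and data are my own, not from any real study. In a dummy-coded, saturated one-factor OLS model, the intercept equals the baseline group's mean and each dummy coefficient equals that group's mean minus the baseline mean.)</p>

```python
from statistics import mean

# Hypothetical outcomes for a three-condition experiment
# (illustrative numbers only, not from any real study).
control = [4.0, 5.0, 6.0]
treat1 = [6.0, 7.0, 8.0]
treat2 = [5.0, 6.0, 7.0]

def ols_saturated(baseline, *others):
    # In a dummy-coded, saturated one-factor OLS model, the intercept
    # is the baseline group's mean and each dummy coefficient is that
    # group's mean minus the baseline mean.
    b = mean(baseline)
    return b, [mean(g) - b for g in others]

# Parameterization: control as baseline, treatments as additional factors.
intercept, (b_t1, b_t2) = ols_saturated(control, treat1, treat2)

# Parameterization: treatment 1 as baseline.
_, (b_c, b_t2_vs_t1) = ols_saturated(treat1, control, treat2)

# The same three means, linearly recombined:
assert b_c == -b_t1
assert abs(b_t2_vs_t1 - (b_t2 - b_t1)) < 1e-12

# Merging conditions changes the estimand: the pooled "any treatment"
# contrast is a sample-size-weighted average of the separate effects.
merged_effect = mean(treat1 + treat2) - mean(control)
print(intercept, b_t1, b_t2, merged_effect)  # 5.0 2.0 1.0 1.5
```

<p>Whichever parameterization is reported, the underlying quantities are the same; the theoretical question should dictate which contrast - each treatment versus control, treatments against each other, or treatments pooled - is the estimand of primary interest.</p>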
<p>Therefore, rather than begin with a survey questionnaire as the starting point for an experimental project, researchers should always start from a clearly articulated empirical expectation situated within a detailed protocol document or pre-analysis plan (regardless of whether that plan is registered). Starting with a general or abstract approach will help to clarify what is intended to be manipulated and measured, and possibly motivate pilot testing or at least reflection upon the range of possible manipulations of the core concepts before defaulting to one that is obvious. This approach will ultimately enable succinct experimental designs that can be clearly communicated to audiences and to one’s future self, without the burden of getting bogged down in the messy data that a questionnaire-first survey-experimental design is likely to generate.</p>
<p><strong>My First Project and Most Recent Publication</strong> (2017-09-16, <a href="https://www.thomasleeper.com/2017/09/my-first-project">permalink</a>)</p>
<p>Academia is a marathon. While it can often feel like a sprint, with deadlines fast approaching and crowds running past you, ultimately everything takes more time than you first think. Case in point: my first ever graduate research project, way back in 2010, which ultimately became a chapter in the dissertation I finished in 2012, was finally accepted for publication last year in <a href="https://doi.org/10.1017/XPS.2017.1">the <em>Journal of Experimental Political Science</em></a> and is online as of September 14, 2017. I never thought it would take eight years to see this research come to light, nor did a much more naive and much younger version of me imagine that it would go through a series of (sometimes painful) rejections. Yet the experience of carrying this project from a nascent idea in the back of the mind of an early graduate student through to publication in a peer-reviewed journal fundamentally transformed how I think about science and specifically <em>open science</em>. Here’s the story of this paper.</p>
<p>I started graduate school in 2008, fresh out of undergraduate at the University of Minnesota with the goal of studying something vaguely at the intersection of political theory, history, and political psychology. Actually, to give you a taste of my mindset, here’s an excerpt from my graduate admissions essay that somehow got me into Northwestern:</p>
<blockquote>
<p>I have become interested in exploring the use of selective histories, political myths, and generational analogies as persuasive tools for priming, framing and agenda-setting in elite rhetoric and as frames in which voters construct attitudes toward current events. I hope to research how the relationship between elite-level and media interpretations of historical events both retell narratives of shared experience and create tension between different ideologies and frames.</p>
</blockquote>
<p>To be clear: I never did anything on “selective histories, political myths, or generational analogies”. I did, however, remain interested in elite political communication and its psychological effects. But two years of coursework got me much more interested in the question of citizen competence and the measurement of political knowledge in particular. I spent much of 2009 and 2010 trying to think up a way to cleverly demolish the existing literature on political knowledge. To convey how long I thought about that, here’s a visual of my version control system:</p>
<p><img src="https://i.imgur.com/YkK9mon.png" alt="prospectus-files" /></p>
<p>Version control was not my strong suit. The first of those drafts proposed that I spend my dissertation trying to understand the following:</p>
<blockquote>
<p>What is political knowledge and why does it matter? To answer this question I pose three specific research questions and experimental designs to address them: (1) on the dimensionality of political knowledge, (2) on the acquisition of political knowledge, and (3) on the implications of political knowledge for attitude formation and decision-making.</p>
</blockquote>
<p>As it would turn out, only the second of those three questions ever became part of my dissertation - indeed, the question of how citizens acquire information is ultimately the only thing I actually studied in my dissertation research.</p>
<p>The first idea for the dissertation was simple: how does attitude strength change the way that people engage with political information? This seemed important because the literature on political communication at the time (and still to this day) adopts one of two paradigms: either researchers randomly expose participants to messages and see what happens or they let people choose information and study the choices per se. My idea was simple: what if we compared those two approaches and looked at the downstream effects, and given that attitude strength might change both what kind of information people acquire and how they respond to it, let’s throw that into the mix as well. What came out was a relatively simple survey experiment, which I was able to piggyback onto another survey that my advisor, <a href="http://faculty.wcas.northwestern.edu/~jnd260/index.html">Jamie Druckman</a>, was already doing in the spring of 2011. It took two years to come up with this first dissertation project idea and about six months to design it, field it, and get the data processed.</p>
<p>Eventually, it made its way into my dissertation and helped me to conceptualize and design the subsequent two empirical chapters. Oddly enough, the other two papers came out first: <a href="https://doi.org/10.1017/S0003055412000123">one in the <em>American Political Science Review</em> in 2012</a> and <a href="https://doi.org/10.1093/poq/nft045">the other in <em>Public Opinion Quarterly</em> two years later</a>. “Chapter 1” as it lovingly remained known in my Dropbox folder lingered on. It failed several places: <em>American Journal of Political Science</em>, <em>The Journal of Politics</em>, and <em>Political Behavior</em>. I abandoned it for a while.</p>
<p>Then, the <em>Journal of Experimental Political Science</em> came around with a first issue published in 2014. I submitted it in late 2015, received an R&R, resubmitted in mid-2016, received a conditional acceptance, some stuff happened at the journal, and I received confirmation it would be published in August of 2017. It is now online. It’s been a long history.</p>
<p>What have I learned from this? First, everything takes time. Coming up with compelling research takes time. Data collection takes time. Data analysis takes time. Writing takes time. Peer review takes time. Rejection takes time. Recovery from rejection takes time. Responding to reviewers takes time. Typesetting takes time. Email takes time. Writing this blog post takes time. Everything takes time. It’s been eight years but that time has brought a publication I’m quite proud of.</p>
<p>But that’s a rather trivial thing to have learned. More importantly, I learned a few things about science and a few things about what I would like the process of science to look like:</p>
<ol>
<li>
<p>I learned that peer review processes are often harsh and heartbreaking. This was my first paper, my academic brainchild. People trashed it. In retrospect, often that trashing contained reasonable feedback but it was couched in language that was sometimes dismissive and hurtful. On the other hand, sometimes the feedback was immensely helpful. Editors at both <em>Political Behavior</em> (shout out to Jeff Mondak) and <em>JEPS</em> (shout out to Eric Dickson) went out of their way to help me improve the paper and think about how to get it into publication. From those experiences, I learned that reviewing needs to be serious but it also needs to be kind. That’s why I started <a href="http://thomasleeper.com/2016/08/be-reviewer-one/">#BeReviewer1</a>.</p>
</li>
<li>
<p>I learned that I was disorganized. Once I’d written a few dozen drafts of this paper, anonymizing it and formatting it for one journal, then deanonymizing it and reformatting it for the next, it became clear that my project management skills were a mess. I got interested in version control software and the basic idea of “getting yourself organized” as an essential part of any scientific project. I learned to start projects differently, using a standardized file structure and to never label files “version 1”, “version 2”, “final”, etc. I increasingly use git to track changes to projects and I learned that it’s important to nudge students and collaborators to do the same.</p>
</li>
<li>
<p>I learned that statistical methods - like the ones used in the paper - do not get used unless there’s software to make them happen. Around the time I was analyzing my data (in 2011, specifically), Brian Gaines and Jim Kuklinski published <a href="http://dx.doi.org/10.1111/j.1540-5907.2011.00518.x">a paper in <em>AJPS</em></a> that proposed new techniques for analyzing the very kind of experiment I had designed. Eventually, my paper adopted the analytic methods they proposed; I bootstrapped some code to implement their estimators. But later, in a review process, an editor caught that there was a computational error in my results. Hacking together an estimator led me to calculate some key statistics incorrectly; they were forgiving and I fixed the error before it went to print. But this taught me that we need reliable tools anytime we use new methods. For that reason, I wrote a package for R, called <a href="https://cran.r-project.org/package=GK2011">GK2011</a> (after the original publication), that implements their methods in a consistent interface, supported by <a href="https://cran.r-project.org/web/packages/GK2011/README.html">a complete replication of their analyses</a>, and <a href="https://github.com/leeper/GK2011">an open-source repository on GitHub</a> that facilitated version control, contributions from other users, automated testing, and general transparency. I learned that doing applied research often requires this kind of software development work even though it is rarely rewarded.</p>
</li>
<li>
<p>I learned about the need for analytic and methodological transparency. This project involved an innovative method (that reviewers initially found confusing) and analysis that was not simply out-of-the-box regression analysis. I put all of the code, data, materials, and the source code for the manuscript online in <a href="https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/17865">a persistent data repository</a> long before the paper was ever published. I learned during the time that this paper was not yet an article that authors are rarely so transparent. Now I hold myself to a higher standard.</p>
</li>
<li>
<p>I also learned that paywalls prevent researchers - especially those outside of the richest countries and even within rich countries outside of the richest universities - from accessing new research. A persistently available copy of this preprint <a href="https://s3.us-east-2.amazonaws.com/tjl-sharing/assets/PoliticalCommunicationSelfSelection.pdf">will always be online</a> and it won’t be hosted at a place like Elsevier-owned SSRN.</p>
</li>
</ol>
<p>These are some lessons learned from a research marathon that started when I was a second-year graduate student and ended when I was a tenured professor. Research takes time, it’s better when done openly and carefully, and ultimately it’s worth the wait.</p>Thomas J. LeeperAcademia is a marathon. While it can often feel like a sprint, with deadlines fast approaching and crowds running past you, ultimately everything takes more time than you first think. Case in point: my first ever graduate research project way back in 2010, which ultimately became a chapter in my dissertation which I finished in 2012, was finally accepted for publication last year in the Journal of Experimental Political Science and is online as of September 14, 2017. I never thought it would take eight years to see this research come to light, nor did a much more naive and much younger version of me imagine that it would go through a series of (sometimes painful) rejections. Yet the experience of carrying this project from a nascent idea in the back of the mind of an early graduate student through to publication in a peer-reviewed journal fundamentally transformed how I think about science and specifically open science. Here’s the story of this paper.Promoting First-Generation University Graduates in Academia2017-08-24T00:00:00+00:002017-08-24T00:00:00+00:00https://www.thomasleeper.com/2017/08/first-generation-academics<p>I recently tweeted my intention to promote the research of early-career researchers currently on the academic job market:</p>
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">on the academic market, send me a one-sentence summary of your research and a link to your website at t.leeper (at) lse dot ac dot uk. 2/2</p>— Thomas Leeper (@thosjleeper) <a href="https://twitter.com/thosjleeper/status/900450467662225408">August 23, 2017</a></blockquote>
<script async="" src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>Why am I doing this? Genuinely, I was inspired by <a href="https://twitter.com/maya_sen/status/895335421491560448">Dr. Sen’s continuing efforts</a> to promote academic job candidates from underrepresented backgrounds. It’s a laudable public service to take one’s own time to advance the careers of others. I have benefitted from the assistance of others, and I felt it was a good and straightforward way to pay that forward.</p>
<p>Why promote first-generation college graduates in particular? Why does this group matter? The answer to those questions is multifaceted and starts with a bit of autobiography.</p>
<p>I’m a second-generation university graduate. I am a proud holder of a BA from my home state’s land-grant institution, the University of Minnesota. My parents both went to “The U”, as well. They were first-generation college students but I always knew I was going to go to college; that was never even a question. Yet the idea that university education is the default - that I grew up assuming there was no other path in life - reflects the naivete of educational privilege. Figures from the US Census Bureau suggest that <a href="https://www.census.gov/content/dam/Census/library/publications/2016/demo/p20-578.pdf">only about one-third of Americans have a bachelor’s degree</a>. Perhaps 40% of Americans have attended college or earned an associate’s degree, but for the vast majority of Americans college has never been a part of their life. It’s always been a part of mine and it is easy to see how that lived experience colors my understandings of academic life.</p>
<p>I have the privilege of having parents that helped me understand college and who encouraged me to pursue a PhD. Yet even with that context, I have always found and continue to find some aspects of academic life immensely confusing. I can’t describe them all here; maybe I’ll write a separate post about conference fashion at some point. Among these points of confusion was that when I started my PhD, I genuinely did not understand that it was an educational track that serves mainly as an apprenticeship for a lifelong academic career. I distinctly recall a conversation at my visit weekend at Northwestern when a faculty member said “The job market is looking stronger; we’ve placed candidates well recently” and I assumed they meant a job market that contained career paths beyond being a university professor. While that is changing, a PhD in social science remains largely a one-track endeavor.</p>
<p>In my continued professionalization, one of the main things that still surprises me is how many academics have parents who are academics. Some <a href="http://www.u.arizona.edu/~jag/POL602/firstGen-profKniffin.pdf">(unfortunately dated) statistics suggest that a sizable portion of the staff at research-intensive American universities have parents with postgraduate degrees</a>. Given that only 12% of the American public has postgraduate education, that’s rather startling. Academia and, it seems, academics tend to reproduce themselves.</p>
<p>To me it is obvious that those individuals coming into an academic career without the inter-generational support of university experience face the difficulty both of making sense of the confusing world of academia and of trying to gain entry into a profession dominated by people - quite like me - who have basically known no other route. As universities worldwide wrestle with questions of promoting diversity of background, it’s important that economic and social notions of diversity (which might reasonably be expected to shape research questions, methods, theorizing, interpretations, and viewpoints) are considered. This should not be at the expense of considering other notions of diversity but rather as one additional aspect of diversity that is often invisible and often easy to forget.</p>
<p>There’s also a final reason that I think actually doing something to encourage and promote early-career researchers is important. As participants in “political science Twitter” will surely know, there are individuals with a reputation for <a href="https://sasconfidential.com/2015/12/02/on-student-shaming/">punching down</a> on their own students or others they disagree with on social media. One person, who shall go unnamed, is particularly prone to punch down when matters of academic diversity come up with the retort that working-class individuals deserve just as much or more attention than other groups. That’s all well and good, but unless that argument is followed by some action, it’s a meaningless rebuttal. Those of us with even a modicum of social media followership should use our positions for good, not evil.</p>
<p>So, if you are a first-generation college graduate who is now embarking on the task of starting an academic career, I would like to hear from you. I will be promoting the research of early career political scientists on Twitter under the <a href="https://twitter.com/search?f=tweets&vertical=default&q=%23poliscimarket&src=typd">#poliscimarket hashtag</a>. All subfields, methods, schools, ambitions, etc. welcome. If you’d like me to tweet about you, please send a one-sentence summary of your research and a link to your professional website to me at t.leeper (at) lse.ac.uk. No need to be wordy or super formal; don’t worry if we’ve never met or spoken; I promise I won’t bite. Tweets will go out the week of September 4th after the APSA meeting in San Francisco.</p>Thomas J. LeeperI recently tweeted my intention to promote the research of early-career researchers currently on the academic job market:The British are Indifferent About Many Aspects of Brexit2017-08-13T00:00:00+00:002017-08-13T00:00:00+00:00https://www.thomasleeper.com/2017/08/brexit-preferences-study<p>A summary of my recent research with Sara Hobolt (LSE) and James Tilley (Oxford) on the public’s preferences over Brexit negotiations, which uses a conjoint experimental design, has been posted on the <a href="http://blogs.lse.ac.uk/brexit/2017/08/13/the-british-are-indifferent-about-many-aspects-of-brexit-but-leave-and-remain-voters-are-divided-on-several-key-issues/">LSE Brexit Blog</a>.</p>
<p>A <a href="https://s3.us-east-2.amazonaws.com/tjl-sharing/assets/Brexit_Means_Brexit_Technical_Report.pdf">detailed technical report on the design and findings</a> is also available.</p>Thomas J. LeeperA summary of my recent research with Sara Hobolt (LSE) and James Tilley (Oxford) on the public’s preferences over Brexit negotiations, which uses a conjoint experimental design, has been posted on the LSE Brexit Blog.Cloward-Piven, but for Peer Review2017-05-15T00:00:00+00:002017-05-15T00:00:00+00:00https://www.thomasleeper.com/2017/05/cloward-piven-peer-review<p>In 1966, Richard Cloward and Frances Fox Piven proposed overwhelming the American welfare state in order to force revolutionary anti-poverty reforms. This <a href="https://en.wikipedia.org/wiki/Cloward%E2%80%93Piven_strategy">Cloward-Piven strategy</a> focuses on getting individuals who were eligible for social welfare benefits to claim them, thus putting strain on government budgets that presume some degree of non-participation, which would in turn force national reform of the public welfare model (in their hopes, in favor of a national basic income). The idea never came to fruition but has remained a talking point for more than fifty years.</p>
<p>What does a radical anti-poverty idea have to do with academic publishing? More than you might think. Cloward and Piven were concerned that the poor were not getting what they deserved and were legally entitled to, and their plan was simple: get what is rightfully yours. Academic publishing has something of a similar problem. Taking <a href="http://voxeu.org/article/nine-facts-about-top-journals-economics">economics as a case study</a>, the number of articles being published in “top” journals has been constant or even declining since the 1970s despite a substantial surge in the number of papers submitted for peer review.</p>
<p>Academic journals, in short, are processing more research than ever before but a smaller and smaller percentage of that research is being published. Perhaps this expanding set of submissions is just lower quality - that is of course possible - but if the distribution of quality is unchanged over time, then journals are simply selecting from an ever more extreme portion of the tail of that distribution. A 1980-era acceptance rate of 15% has become a 2010-era acceptance rate of 6%. This necessarily means research that would have previously been considered leading in its field is now relegated outside of the “top” journals that accumulate most academic citations.</p>
<p>Do researchers deserve to be published in top journals? Of course not, but research that used to be “good enough” for top journals no longer is because print publications have failed to adapt to the surge in research activity and the increasingly large number of contributions coming from outside of the traditional, American-centric, R1-type pool of research universities. Nearly everyone will agree that the best research should be published in top journals but it would be hard to construct a broadly agreeable argument that the definition of “best” should shift based solely on the amount of research being produced.</p>
<p>Consumers of academic research deserve to benefit from this increased research productivity. And academic researchers’ careers depend on publication, which seems to only grow less likely as the number of research-active faculty increases over time. So how can Cloward-Piven help? Two ways.</p>
<p>First, researchers need to not let low acceptance rates discourage them from submitting to top journals. If we overwhelm top journals with research, eventually more and more high-quality research will be rejected based purely on the grounds of limited publication space, driving dissatisfaction with the utility of these journals.</p>
<p>Second, reviewers working for top journals should stop making “good enough for [such and such journal]” decisions. Good research should be published in top journals, regardless of how much good research there is. Again, forcing editors to reject research based on the grounds of limited publication space will highlight the structural problems with current academic publishing models, driving dissatisfaction.</p>
<p>But what end does this dissatisfaction achieve? Hopefully, a fundamental change toward open access publishing with publication volumes unconstrained by print journal limits and publisher constraints. Like Cloward-Piven, I won’t entirely hold out hope for this, but the idea seems worth discussing. As of yet there is limited buy-in to alternative publication models (at least in the social sciences) where open access and preprints remain controversial. But by overwhelming the system, we might be able to force disciplinary and interdisciplinary discussion of how to sustain science in the long run and hopefully - at a minimum - drive publishers to provide more space for top research and - ideally - induce more systemic changes that enhance public access to scientific research.</p>Thomas J. LeeperIn 1966, Richard Cloward and Frances Fox Piven proposed overwhelming the American welfare state in order to force revolutionary anti-poverty reforms. This Cloward-Piven strategy focuses on getting individuals who were eligible for social welfare benefits to claim them, thus putting strain on government budgets that presume some degree of non-participation, which would in turn force national reform of the public welfare model (in their hopes, in favor of a national basic income). The idea never came to fruition but has remained a talking point for more than fifty years.Packaging Your Reproducible Analysis2016-11-14T00:00:00+00:002016-11-14T00:00:00+00:00https://www.thomasleeper.com/2016/11/analysis-as-package<p>There are a lot of debates about how to structure a reproducible research project. To be reproducible, a quantitative analysis needs to be <em>open</em>, that is, it must contain the data and software needed to recreate the analyses from start to finish. Or, <a href="http://thomasleeper.com/2015/05/open-science-language/">as I have written before</a>, reproducible research is about recreating output from shared input(s). 
But once we agree on that general principle, how do we implement it in practice? Of course, the shared inputs (the data, code, etc.) need to be shared via a persistent, citable data archive - <a href="https://politicalsciencereplication.wordpress.com/2014/05/21/guest-post-why-reproducibility-requires-data-archiving-by-thomas-leeper/">not your personal website</a> - but in what form precisely should that shared input be organized?</p>
<p>One answer that I’ve been recommending recently is to construct a reproducible analysis as an R package. That premise - going to the effort to organize your code, data, and other files into an R package - can seem daunting if you’ve never created a package before. But, if you’re starting to work reproducibly and you’re trying to get your files organized anyway, there are some good reasons to think about your project as a package. The first of these is that you’re probably <em>almost</em> creating a package anyway. That’s crazy you might say. Not so.</p>
<p>Consider, for example, project file structures suggested by <a href="https://www.crcpress.com/Reproducible-Research-with-R-and-R-Studio/Gandrud/p/book/9781466572843">Christopher Gandrud</a>, <a href="http://kbroman.org/Tools4RR/assets/lectures/06_org_eda_withnotes.pdf">Karl Broman</a>, and others, which involve something like:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/data
/code
/slides
/paper
/misc
README.md
makefile
</code></pre></div></div>
<p>That structure yields an overarching directory to contain all project files, with subdirectories for data, analysis code, a manuscript and/or presentation slides, and possibly other files. This keeps things neatly organized, with similar files together, and some kind of overarching build process (e.g., <a href="http://thomasleeper.com/2016/09/make-make-make-again/">a <code class="language-plaintext highlighter-rouge">make</code>-based workflow</a>) that turns these inputs into the outputs that you care about (the finished paper, slides, etc.).</p>
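<p>Concretely, the kind of <code>makefile</code> behind such a build process might look like the following minimal sketch (the file and directory names here are illustrative placeholders, not taken from any particular project):</p>

```makefile
# Minimal sketch of a make-based workflow: each output lists the inputs it
# depends on, so `make` rebuilds only what has changed.
# All file names (clean.R, paper.Rmd, etc.) are placeholders.

all: paper/paper.pdf

# Render the manuscript once the processed data exists
paper/paper.pdf: paper/paper.Rmd data/clean.csv
	Rscript -e 'rmarkdown::render("paper/paper.Rmd")'

# Turn raw data into the analysis dataset
data/clean.csv: code/clean.R data/raw.csv
	Rscript code/clean.R
```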
<p>The key point is that if you already have a structure like the above, you’re 95% of the way to having an R package already, which has a structure like so:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/data
/R
/inst
/vignettes
README.md
DESCRIPTION
NAMESPACE
</code></pre></div></div>
<p>The only differences between the R package structure and a typical reproducible research workflow are the naming of some key directories (e.g., <code class="language-plaintext highlighter-rouge">/code</code> or <code class="language-plaintext highlighter-rouge">/analysis</code> must be <code class="language-plaintext highlighter-rouge">/R</code>), that <code class="language-plaintext highlighter-rouge">/paper</code> or <code class="language-plaintext highlighter-rouge">/slides</code> are now in a directory called <code class="language-plaintext highlighter-rouge">/vignettes</code>, and that R packages require two top-level files: <code class="language-plaintext highlighter-rouge">DESCRIPTION</code>, which describes the package and contains usefully structured metadata, and <code class="language-plaintext highlighter-rouge">NAMESPACE</code>, which describes what functions from <code class="language-plaintext highlighter-rouge">/R</code> should be available to those who install the package.</p>
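<p>To see how small that gap is, here is a rough shell sketch that scaffolds the whole skeleton from scratch (the name <code>yourproject</code> and all metadata values are placeholders):</p>

```shell
# Scaffold a minimal R package skeleton; every name below is a placeholder.
mkdir -p yourproject/R yourproject/data yourproject/inst yourproject/vignettes

# DESCRIPTION carries the structured package metadata.
cat > yourproject/DESCRIPTION <<'EOF'
Package: yourproject
Title: A Reproducible Analysis
Version: 0.1.0
Description: Data, code, and manuscript for a reproducible project.
License: CC0
EOF

# NAMESPACE declares what the package exposes; export everything for now.
echo 'exportPattern("^[^.]")' > yourproject/NAMESPACE

touch yourproject/README.md
ls yourproject
```

<p>From there, <code>R CMD build yourproject</code> produces an installable tarball.</p>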
<p>Similar to a <code class="language-plaintext highlighter-rouge">make</code>-based workflow, an R package provides a natural build workflow, where running <code class="language-plaintext highlighter-rouge">R CMD build package</code> on the command line will “build” the package, and convert manuscript files (e.g., <code class="language-plaintext highlighter-rouge">/vignettes/paper.Rnw</code> or <code class="language-plaintext highlighter-rouge">/vignettes/slides.Rmd</code>) into a finished, fully reproducible research output.</p>
<p>The packaging itself is painless and, if one prefers, requires no further work. To share your reproducible research then simply requires distribution of the package; end users desiring to reproduce or otherwise examine your results need only <code class="language-plaintext highlighter-rouge">install.packages("yourproject")</code> and examine the vignette(s).</p>
<p>But packaging offers some additional advantages:</p>
<ol>
<li>
<p>R’s quality control tool - <code class="language-plaintext highlighter-rouge">R CMD check</code> - provides a huge number of useful checks of your code quality and formatting, making sure you have parsable, executable code.</p>
</li>
<li>
<p>Adding additional directories, such as <code class="language-plaintext highlighter-rouge">/man</code>, further provides resources that are useful to other end users or your future self. These might, for example, describe your data structure and provide a data citation, and describe the use of your package functions with helpful examples. Through <a href="https://cran.r-project.org/web/packages/roxygen2/vignettes/roxygen2.html">roxygen2</a>, that documentation can easily be automatically generated from the kinds of code comments you may already be using to mark up your code.</p>
</li>
<li>
<p>Distributing your package via a major repository, such as <a href="https://cran.r-project.org/">CRAN</a>, also means that your project - if at all useful to others - will be perpetually available and constantly checked for reproducibility on new versions of R.</p>
</li>
<li>
<p>Packaging creates a useful framework for adding “tests” of your code. In addition to the main output files produced from <code class="language-plaintext highlighter-rouge">/vignettes</code>, you can add a <code class="language-plaintext highlighter-rouge">/tests</code> directory to your package directory that will contain tests, which can be used to achieve many important objectives, including ensuring that your data takes the form you expect (particularly if you will update that data in the future), that your functions behave as you describe them, and that your package’s code generates the outputs you believe it should (with respect to object structures, classes, and numeric values).</p>
</li>
</ol>
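<p>As an illustration of the last point, a minimal test file using the testthat package might look like the following R sketch (the package name, dataset name, and column names are all hypothetical):</p>

```r
# tests/testthat/test-data.R -- illustrative sketch; all names are hypothetical.
library(testthat)
library(yourproject)

test_that("the analysis data has the expected structure", {
  data("studydata", package = "yourproject")
  expect_s3_class(studydata, "data.frame")
  expect_true(all(c("id", "condition", "outcome") %in% names(studydata)))
  expect_gt(nrow(studydata), 0)
})
```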
<p>While these advantages are numerous, I think a major psychological hurdle for many researchers is that they do not think of their project as software development. Rather, they are only running existing code and therefore do not see the need for a package. This, however, misses the point that a package need not actually contain any new code (or any data for that matter). The <em>package structure</em> is what is most important because it enables the kind of automated workflows and testing that R already provides. If your project only contains a dozen lines of R code that can easily be placed in a knitr document, that’s not a reason against packaging! Indeed, that’s a sign that it will be really easy to package your project.</p>
<p>To make the point that this is fairly simple, I would point you to a recent R package that I produced which serves both as (1) a hopefully useful tool for those of us running a particular kind of survey-experimental study, and (2) an example of how to package a reproducible analysis. The package, GK2011, is available on <a href="https://cran.r-project.org/package=GK2011">CRAN</a> and <a href="https://github.com/leeper/GK2011">GitHub</a>. My purpose in writing the package was to provide an easy-to-use function for performing a particular experimental treatment effect estimate from a paper by <a href="http://doi.org/10.1111/j.1540-5907.2011.00518.x">Gaines and Kuklinski (2011)</a>. But to demonstrate that the package worked correctly, I included Gaines and Kuklinski’s original data and used the top-level <code class="language-plaintext highlighter-rouge">README.Rmd</code> to reproduce their empirical analysis. Thus the package is a simple example of a reproducible project, where the single analytic function is included in <code class="language-plaintext highlighter-rouge">/R</code>, the data is included in <code class="language-plaintext highlighter-rouge">/data</code>, some tests of the single function are in <code class="language-plaintext highlighter-rouge">/tests</code>, and the complete analysis is in the README (which could have just as easily been in the vignette directory). Using roxygen2 markup, <a href="https://github.com/leeper/GK2011/blob/master/R/estimate.R">the code</a> and <a href="https://github.com/leeper/GK2011/blob/master/R/ajps.R">the data</a> are fully documented, with examples that reproduce the results reported in the README. By making this an R package, <code class="language-plaintext highlighter-rouge">R CMD build</code> and <code class="language-plaintext highlighter-rouge">R CMD check</code> will verify that the code works as expected and can be installed by any other R user.</p>
<p>The GK2011 package is intentionally incredibly simple. It showcases how easy it is to make a reproducible research project into an R package and hopefully showcases that making that small leap comes with the advantages of natural and easy-to-use quality control tools.</p>
<p>Good luck with your packaging!</p>Thomas J. LeeperThere are a lot of debates about how to structure a reproducible research project. To be reproducible, a quantitative analysis needs to be open, that is it must contain the data and software needed to recreate the analyses from start to finish. Or, as I have written before, reproducible research is about recreating output from shared input(s). But once we agree on that general principle, how do we implement it in practice? Of course, the shared inputs (the data, code, etc.) need to be shared via a persistent, citable data archive - not your personal website - but in what form precisely should that shared input be organized?It’s Different Over Here, and Here, and Here…2016-11-08T00:00:00+00:002016-11-08T00:00:00+00:00https://www.thomasleeper.com/2016/11/different-here<p>Chris Blattman has put together <a href="http://chrisblattman.com/job-market/">a really useful list of advice</a> for PhD students on the political science and economic job markets. It’s a fantastic compilation of myriad resources and informal nuggets of wisdom. It also showcases the regularity of academic hiring processes in the United States. Given that I’ve now worked in two European countries, I thought I would throw in some quick caveats about how things work over here.</p>
<p>The most important of these is that <em>there is no European academic job market</em>. While people sometimes talk about “American political science” versus “European political science” to highlight differences in research paradigms, it is really important to recognize that Europe does not have a unified market in the way that the US does. Every country more or less operates independently, with completely different norms about academic training, whether or not postdoctoral work is expected or required, when hiring occurs, whether positions carry American-style “tenure”, what kind of teaching and research expectations are in place, and how interviews work. (Aside from the below, you may consider looking at <a href="https://www.cambridge.org/core/journals/ps-political-science-and-politics/article/introduction/ABFB4C5F5C99DA3DFFA1837D04EB379F/core-reader">the current issue of <em>PS</em></a> for some insight into disciplinary norms in Europe.)</p>
<p>The biggest issue, by far, in European academia is language. If you’ve met with, visited, or collaborated with European scholars and are coming from an American tradition, you inevitably worked in English because it is the operating language of most contemporary political science. While English is also the dominant operating language in European academia, national or local languages almost universally dominate teaching and administrative structures at European universities. Places that have English-language courses or degree programs inevitably teach in a local language at the undergraduate level. This has the consequence that some European universities and academic environments can appear inward-facing or uninterested in hiring PhD students trained abroad. If you are interested in moving to a country where English is not the predominant language and you do not speak the local language, seeking employment in such a place will inevitably require learning a new language. This has obvious advantages (like making your life in that place more enjoyable and less isolating) but comes at the cost of other things you may be expected to do post-PhD (like research). If you are offered an interview (or even a job) at a university that does not operate in a language you currently speak - regardless of what anyone tells you - you need to operate on the assumption that you will be expected to learn the local language in the near-term. Some places will put this explicitly in a job contract (e.g., permanent staff being required to teach in the native language after 1-2 years) but at other places it may be only an informal expectation. If you’re considering applying to a European school, keep this in mind. It’s one of the few universals across country contexts.</p>
<p>European PhD programs tend to be shorter than American programs. Typically they are 3-4 years, with a prerequisite masters degree. This means that those with European PhDs have been trained in an entirely different way from American PhD students, which carries with it expectations about what a newly minted PhD will have done. In many places, it means that a PhD student will not have published more than perhaps one paper. Non-permanent postdoctoral positions are therefore common in many countries. Many national research councils also fund postdoctoral positions independently of universities. Both countries I’ve worked in (Denmark and the UK) operate these schemes, as does the European Union under the Marie Curie banner. This means that if you are interested in moving to a European university for a postdoc, you will often either (1) be tied to a specific, council-funded research project or (2) need to apply directly to a national research council or the EU to obtain an “independent” postdoc. The advantage of the latter is that almost any university will gladly host a postdoc, due to the overhead that such positions provide. In many countries a postdoc is required before applying for a more permanent position, meaning that, as a new PhD from the US, you may not even be considered by a European school for an Assistant Professor position.</p>
<p>However, given the diversity of institutional structures, there are exceptions and ambiguities. For example, in Denmark there is no distinction between a “Postdoc” and an Assistant Professor (“Adjunkt”). Both are three-year, temporary positions. Universities typically have no discretion to hire at this level; they can only accept positions awarded centrally by the Danish Research Council. This means there is no “market” per se; there are only periodic calls for positions from the council and periodically from projects that have been funded and include postdoc positions.</p>
<p>The “German system” is also an exception. If you’re not familiar, German academia operates on an old-school hierarchy wherein “Professors” are essentially the equivalent of US-style endowed chairs, typically with substantial budgets, teaching responsibilities, and oversight of other staff. If you are a professor, you will have numerous temporary staff and PhD students who work for you. Junior positions are typically offered by specific professors rather than by departments or universities, and there is no job security. A six-year junior position in the German system has essentially no guarantee of translating into a permanent position. Professorships are rare and upward mobility is unlikely. Contrast that with the Danish system, where a postdoctoral position is a three-year, tenure-track post that in many cases will translate into an Associate Professorship (a permanent position).</p>
<p>These two examples highlight why there is no European market per se and often no national market either. Because junior positions are not necessarily offered by universities and are not necessarily tied to specific teaching needs, hiring occurs in a largely ad-hoc fashion. That also reflects the fact that PhD programs do not necessarily have “cohorts” in the US sense; students may begin and end their degrees throughout the year. If you are considering a move to Europe, you need to know that calls can occur at any time, with quick deadlines, and seemingly arbitrary start dates. The LSE and some US-focused universities attempt to hire during the US job market cycle, but not always.</p>
<p>A further consideration closely related to this complexity of hiring processes is the diversity of position types. In my experience, most European universities do not have American-style tenure. In many countries, however, labor market policies provide a general protection in the form of a “permanent” contract. <a href="http://www.socialsciencespace.com/2015/07/so-how-does-tenure-work-in-europe/">Social Science Space has an article on how this works in Germany, the Netherlands, and the UK</a>. Permanent, open-ended, or indefinite contracts are akin to tenure but without the same degree of protection. Similarly, fixed-term, short-term, or temporary contracts often provide no “tenure track” through which promotion might occur. In the German and Danish systems, for example, hiring for permanent positions works by open call rather than by assessing the merits of internal candidates. The lack of tenure does not necessarily mean insecurity; in a permanent position, you would have the same kind of labor market protections as any other worker with a similar contract. It’s just not “tenure” per se.</p>
<p>Hiring, then, can take many forms and can vary between position types. I have encountered postdoc hiring procedures that span the full spectrum from an American-style, two-day interview with a job talk (LSE uses this for some temporary positions) to a completely paper-based process involving review of applications and a research statement. Similarly, these processes vary in their formality: from an individual researcher having sole discretion to hire a postdoc, to a committee-style decision, to a process involving external peer review and formal ranking of all applicants (see, for example, Scandinavia). For more permanent positions, there is the same degree of variation in procedures, and processes vary considerably across countries, across universities within countries, and even across departments in the same university.</p>
<p>The “British” hiring process for any kind of position typically involves two core components: (1) a very brief (perhaps 20-minute) research presentation wherein a candidate describes their general research profile and trajectory and possibly provides details on one ongoing project, and (2) a “panel interview” where the candidate is interviewed by a hiring committee. Somewhat awkwardly, these interviews tend to take place back-to-back on the same day, one candidate after another. Presentations, lunch, or informal conversation may occur in the presence of other candidates for the same position. LSE traditionally has relied on this system and some departments still use it; ours has adapted to a more US-style approach (longer-form job talks and one-on-ones) but retains features of the traditional system.</p>
<p>If you are offered an interview at a European university, it is really important that you ask about what precisely is expected for each portion of the interview and, if possible, consult with academics based in that country to learn about how the process works. Some universities involve the broader department in discussions about hiring, while in many cases hiring decisions are at the discretion of a single project, a department section, the hiring committee, or even solely the head of department. Again, there are no consistencies across universities or across countries, so it is vital to know what is expected at the specific place you are visiting.</p>
<p>While all of this can be rather daunting, it’s important to know that many European universities offer excellent conditions. There is of course considerable variation across richer and poorer universities and across countries, but many European countries impose moderately strict and typically generous (by American standards) rules about holiday and time off, parental leave, and health and childcare benefits. The variation here runs from essentially no benefits (e.g., in Southern European universities) to one-year parental leave and six weeks of mandatory holiday (e.g., in Scandinavia). My experience is that the dual layers of federalism provided by the EU also mean that there are more (though possibly more competitive) opportunities for research funding from national and EU-level research councils, and many countries dedicate research funds specifically for junior researchers, for international researchers moving to the country, and for international collaborations. While teaching loads can vary enormously, they can in some cases be extremely low (e.g., a two-year, EU-funded postdoc will typically require no teaching). LSE has a moderate teaching load, but only two ten-week teaching terms, meaning teaching can be intense for short periods but much of the year is set aside for service responsibilities and research. The cost of all of this is that most countries have relatively constrained salary scales; it would be rare to make more as a professor at a European university than at a US R1. Whether that is worth it depends on your own preferences among fringe benefits, responsibilities, geography, and other considerations.</p>
<p>As a final, and probably most important, point: if you are coming from the United States and decide to pursue an academic career in Europe, do not forget to consider your close and extended family and broader social circle. Moving abroad can be challenging, particularly to a country where you do not already speak the language. While European countries typically provide generous benefits for families and work permits for spouses, there is no guarantee that your partner (if you have one) will be able to find employment comparable to what they would have at home, nor that your social environment will be comparable to what you are used to. Everything can work out great, but it can also be challenging. I would encourage every American-trained PhD to consider spending time in Europe - I think it’s a great place to start and continue an academic career - but to do so only if it’s something that will work professionally and personally.</p>Thomas J. LeeperChris Blattman has put together a really useful list of advice for PhD students on the political science and economics job markets. It’s a fantastic compilation of myriad resources and informal nuggets of wisdom. It also showcases the regularity of academic hiring processes in the United States. Given that I’ve now worked in two European countries, I thought I would throw in some quick caveats about how things work over here.

The Surprising Value of Faking Your Data (2016-11-04) https://www.thomasleeper.com/2016/11/value-of-faking-data

<p>No scandal has so shaken political science as the revelation that one-time rising star Michael LaCour had fabricated the data underlying <a href="https://en.wikipedia.org/wiki/When_contact_changes_minds">a prominently published article</a> about the effects of face-to-face conversations on support for gay rights. 
The idea that so much press attention, scholarly thinking, and real-world resources had been expended on fraudulent data was troubling, and it challenged our discipline’s faith in its own credibility. But what if I told you that faking data and expending your time and effort analyzing and writing it up might have some payoff? The radical idea of this post is that faking your data could sometimes make you a better scientist.</p>
<p>Yesterday, <em>Nautilus</em> magazine published <a href="http://nautil.us/issue/42/fakes/the-cosmologists-who-faked-it">a fascinating story</a> by Jonah Kanner and Alan Weinstein about the scientific process behind this year’s splash discovery of <a href="https://en.wikipedia.org/wiki/First_observation_of_gravitational_waves">gravitational waves</a>, a phenomenon theorized by Einstein but previously unobserved. The detections were made by a massive international collaboration called LIGO, using machines designed to detect minute fluctuations in the shape of spacetime. The infrastructure to do this is massive and expensive, involving multiple, extremely precise laser-based measuring devices that wobble in response to movement (think super-precise seismograph). The data generated by these detectors is full of noise, which must be filtered to remove movements attributable to local events - things like an earthquake or someone breathing on the machine - rather than the super-distant black hole events of interest to science. The detections of gravitational waves are based on tiny signals that emerged from the noise of these localized influences.</p>
<p>The stunning part of the story is that long before these detections, LIGO had taken a remarkable step: they had agreed to secretly fake their own data! A handful of collaborators were empowered to work in secret and arbitrarily introduce a fake signal into the data at any point in time, without the knowledge of the rest of the team. The article describes the process in detail and tells the story of how the first signal that LIGO detected - a signal that would trigger thousands of hours of data analysis, discussion, writing, editing, and in-person meetings - was indeed a signal placed by this clandestine group. Here’s how Kanner and Weinstein describe the moment when the team of fakers reveals whether the data was real or not:</p>
<blockquote>
<p>In March 2011, we gathered in a hotel near Arcadia, California to review all the evidence and the paper draft, and vote on submitting it to a journal. There were more than 300 people in the room and about another hundred more connected via the Internet. We brought lots of champagne. We discussed. We voted to approve the paper draft. Speeches were made celebrating the long road we had traveled, from building incredible detectors, to finding a signal, to finally executing the entire procedure for claiming a detection. We opened the champagne.</p>
</blockquote>
<blockquote>
<p>Then Jay Marx, the director of the LIGO Laboratory, who had been carrying a tattered envelope in his pocket for more than six months, took to the stage […] and told us all that the Big Dog was a Big Fake, and that we had just completed the first successful discovery fire drill in gravitational wave observation history, we still treated it as a moment of celebration. We raised glasses of champagne, and toasted our fake success. It was a strange, hollow feeling.</p>
</blockquote>
<p>Champagne aside, the moment sounds chilling. All of the work of analyzing data, mulling over the results, debating them with coauthors, discussing how to appropriately characterize this tiny, one-off observation cost countless hours of work and mental anguish. And it was all apparently for naught.</p>
<p>Yet Kanner and Weinstein point out that whatever the effort cost, it also carried innumerable rewards. This fake detection that they didn’t know was fake had made them wrestle with issues they never would have anticipated until they actually had a signal in front of them:</p>
<blockquote>
<p>Big Dog had motivated a flurry of work, including big steps forward in our ability to measure the masses of the source objects (the neutron star or black hole) using only the gravitational wave signal. Most significantly, our collaboration had agreed for the first time what standards we’d use, and how we’d minimize our biases. For the first time, we had decided that we had enough evidence for a detection.</p>
</blockquote>
<p>Or as they say elsewhere in the article, “The fake injection bugaboo forced us to keep an open mind, apply skepticism and reason, and examine the evidence at face value.” While the collaborators had run simulations to try to see what a signal would look like and had thought about how to analyze the data and validate a hypothetical finding, that work was carried out under low stakes; everyone knew the data was fake. They didn’t have to take it seriously.</p>
<p>By faking the data, the scientists didn’t know whether the signal was real. But they did know that it could be fake. That helped them enforce the kind of radical, skeptical objectivity that is necessary for the scientific process to work - the kind of skepticism that is supposed to be a bedrock of scientific work. In reality, any signal - even one not planted by the fakers - could be nothing but noise. Sometimes we lose track of that as we analyze our data; we forget that this could be an error; the quest to find a signal in the noise can lead us down <a href="http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf">dangerous, forking paths</a>. The LIGO team’s courage to fake their data as a test of their own scientific practice sounds like it made them better researchers. And it should give us some additional faith that the 2015 detection they reported was not an overly optimistic interpretation of noisy data.</p>
<p>There’s something to be learned there for every scientist. While we, of course, can never publish fraudulent data and should never actively work to deceive others about our research claims, there may be some value in occasionally lying to ourselves. It can serve as a reminder about the unavoidable reality of false positives and as a check on whether we p-hack and engage in confirmation biases. I’ve previously advocated for <a href="http://thomasleeper.com/2015/05/fraud-prevention/">a “Ben-bot” intermediary between scientists and their source data</a> as a means of preventing fraud. A Ben-bot could also be adapted to sometimes lie to us to run us through an unanticipated fire drill (provided she or he eventually tells us it’s only a drill).</p>
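<p>To make the fire-drill idea concrete, here is a minimal sketch of what a blind injection drill might look like in code. This is purely my own illustration - the function name <code>blind_drill</code> and all of its parameters are hypothetical, not anything LIGO or a real “Ben-bot” runs. A coordinator secretly decides whether to add a synthetic effect to the data before the analyst ever sees it, and the truth sits in a sealed “envelope” to be opened only after the write-up is done:</p>

```python
import random
import statistics

def blind_drill(data, effect=1.0, p_fake=0.5, seed=None):
    """With probability p_fake, secretly add a synthetic 'signal' to the data
    before the analyst sees it. The truth is returned separately as a sealed
    'envelope', to be opened only after the analysis is written up."""
    rng = random.Random(seed)
    faked = rng.random() < p_fake
    doctored = [x + effect for x in data] if faked else list(data)
    return doctored, {"faked": faked}

# The analyst sees only `observed` and must decide whether the apparent
# shift away from zero is a real effect or just noise.
rng = random.Random(42)
noise = [rng.gauss(0, 1) for _ in range(1000)]
observed, envelope = blind_drill(noise, effect=1.0, p_fake=0.5, seed=7)

estimate = statistics.mean(observed)
# Only after the analysis is complete is the envelope opened:
print("estimate:", round(estimate, 2), "| was it fake?", envelope["faked"])
```

<p>The design choice that matters is the separation of roles: whoever calls <code>blind_drill</code> must not be the person analyzing <code>observed</code>, or the drill loses its power to test the analyst’s skepticism.</p>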
<p>Indeed, this kind of scientific fire drill triggers exactly the kind of activity called for by advocates of study preregistration (precise specification of analyses, etc.), but it sets the stakes higher. It pretends that we have the actual data (rather than a simple, known simulation) and tests how we handle the situation. If you perform well in a drill, you should be better able to escape the real fire unscathed. If you perform badly, it’s a clear reminder to up your game. Maybe it’s worth thinking about faking our data more often.</p>Thomas J. Leeper

Treatment Self-Selection is Worth Studying Per Se (2016-10-23) https://www.thomasleeper.com/2016/10/deaton-cartwright-selection

<p><em>This is the third of a series of posts on Angus Deaton and Nancy Cartwright’s working paper, “<a href="http://www.nber.org/papers/w22595">Understanding and Misunderstanding Randomized Controlled Trials</a>.” See the <a href="http://thomasleeper.com/2016/10/deaton-cartwright/">first post</a> and <a href="http://thomasleeper.com/2016/10/deaton-cartwright-always-late/">second post</a>.</em></p>
<p>One of Deaton and Cartwright’s best critiques of experimental research relates to the use of “blinding,” or disguising treatment status from participants in a study. In essence, the lack of blinding (awareness of treatment status) opens pathways from treatment assignment to outcome other than the posited treatment, which is a violation of the exclusion restriction. It may be that knowing I am being treated is what causes my outcome to change, rather than the treatment I receive as a result of being assigned to treatment. Similarly, expecting that I will be treated (as a result of being assigned as such) may change my behavior in other ways - including by leaving the study, attempting to convert my assignment status, or compensating for (non)treatment through means outside the scope of the study.</p>
<p>These placebo effect and compliance dynamics are well known to anyone trained in experimental methods. Indeed, much of an experimental methods course is consumed with how to design studies to minimize these issues and how to analyze the resulting data when these dynamics manifest. Deaton and Cartwright discuss, for example, the three common analytic approaches in situations of non-compliance: <a href="https://en.wikipedia.org/wiki/Intention-to-treat_analysis">intention-to-treat (ITT)</a>, <a href="https://en.wikipedia.org/wiki/Analysis_of_clinical_trials#As_treated">as-treated</a>, and LATE/CATE analysis. This may be (perhaps painfully) familiar to anyone who has gone through experimental methods training. Blinding is meant to avoid both the choice between these methods and the underlying ability of participants to modify their treatment status. Deaton and Cartwright point out, however, that blinding as a design technique is used much less often in social science contexts than in medical studies.</p>
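<p>To make these estimators concrete, here is a toy sketch with made-up numbers (my own illustration, not an example from Deaton and Cartwright): the intention-to-treat estimate compares outcomes by <em>assignment</em>, the as-treated estimate compares by treatment actually <em>received</em>, and a simple Wald-style LATE estimate rescales the ITT effect by the ITT effect on take-up:</p>

```python
from statistics import mean

# Toy dataset: each record is (assigned, received, outcome).
# Non-compliers were assigned to treatment but did not take it, or vice versa.
data = [
    (1, 1, 5.0), (1, 1, 6.0), (1, 0, 2.0),   # assigned to treatment
    (0, 0, 2.0), (0, 0, 3.0), (0, 1, 5.5),   # assigned to control
]

def diff_means(treated, control):
    return mean(treated) - mean(control)

# Intention-to-treat: compare outcomes by *assignment*, ignoring compliance.
itt = diff_means(
    [y for a, r, y in data if a == 1],
    [y for a, r, y in data if a == 0],
)

# As-treated: compare outcomes by treatment actually *received*.
as_treated = diff_means(
    [y for a, r, y in data if r == 1],
    [y for a, r, y in data if r == 0],
)

# Wald-style LATE: ITT effect on the outcome divided by the
# ITT effect on take-up (here 2/3 - 1/3 = 1/3, so late = 2.5).
takeup = diff_means(
    [r for a, r, y in data if a == 1],
    [r for a, r, y in data if a == 0],
)
late = itt / takeup

print(round(itt, 2), round(as_treated, 2), round(late, 2))
```

<p>Note how the as-treated comparison lumps self-selected takers into the treated group - exactly the selection problem that the rest of this post argues is worth studying rather than merely eliminating.</p>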
<p>(As a brief aside, it is worth noting that social science experiments often engage in a form of blindness that medical trials do not. In a medical RCT, a participant knows they will be randomized. In a social science experiment, participants are often blind to the fact that a study is experimental. While knowing what treatment they receive opens paths between treatment status and outcome apart from the treatment per se, blinding to the randomization device - or even to the presence of other treatment conditions - means that the kinds of non-compliance and placebo dynamics that threaten medical RCTs may not manifest at all, because participants are unaware that they have the option to non-comply in ways that require knowledge of other conditions.)</p>
<p>Yet there is an important shortcoming in Deaton and Cartwright’s discussion. Their characterization of blinding - and the selection problems it introduces - focuses only on noncompliance, or what I might call “treatment self-selection” as a problem, as something to be eliminated. This is ironically quite typical of the view taken by most experimentalists. The advantage of experiments is to assign treatment, so anything that gives participants choice over treatment moves us outside of an experimental world back into one of observational data. If the researcher does not perfectly dictate treatment status, then why bother even experimenting!?</p>
<p>I have long taken issue with this view (by “long,” I mean since circa-2010 when I started writing my PhD dissertation). Self-selection of treatment is actually an interesting and understudied empirical puzzle that is worth studying in its own right. Indeed, much of my research has been focused on trying to understand how people behave when they have the ability to self-select their treatment status. And the experimental method has been critical to answering questions about self-selection. Let me explain.</p>
<p>Experiments are typically understood as assigning treatment versus control (versus other treatments). But what precisely is the treatment whose effect we want to understand? Is it, in a study of public attitudes, a media message? Or is it a set of media messages? What is the control? Is it some alternative message? Is it nothing? Sometimes treatment is a message, but sometimes treatment is also <em>access to a message</em>. Sometimes control is nothing, but sometimes it is <em>the choice to not be treated</em>.</p>
<p>For example, in <a href="https://dl.dropboxusercontent.com/u/414906/AmericanPoliticalScienceReview2012.pdf">a 2012 paper</a>, my coauthors and I were interested in opinion dynamics over time. We randomly assigned people to receive a message once and then nothing, to receive the same message frame repeated multiple times, or to receive a message and then be given the choice of subsequent messages. Among other things, we found that people’s self-selection of messages (when given the opportunity to choose) reflected the initial message they received, and the opinions they held at the end of the study were similar to those of individuals who had simply had the message repeated. Self-selection, rather than freeing people from treatment, actually reinforced earlier messaging. We wouldn’t have been able to learn this (or any of the paper’s other findings) without both randomizing people into different conditions and defining some of those conditions as spaces where participants were able to engage in self-selection.</p>
<p>As another example, <a href="https://dl.dropboxusercontent.com/u/414906/PoliticalCommunicationSelfSelection.pdf">one of my forthcoming articles</a> shows that the effects of a randomly assigned treatment actually differ between those who want to receive it and those who would prefer something else. Through some clever analysis <a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1540-5907.2011.00518.x/abstract">developed by Brian Gaines and Jim Kuklinski</a> (and <a href="https://cran.r-project.org/web/packages/GK2011/index.html">accessible as an R package that I wrote</a>), it was possible to give some participants the opportunity to choose what treatment they received, in order to assess whether a treatment randomly assigned in other experimental conditions had different effects on these subsets of the population. The randomization and blindness of some participants had to be forsaken in order to learn something from others.</p>
<p>As <a href="https://dl.dropboxusercontent.com/u/414906/KnowledgeGaps.pdf">a final example from my own work</a>, I have used very subtle violations of self-selection to study how people learn from the news. In what has come to be known as a “patient preference trial,” I allowed participants to choose what kinds of news they wanted to receive, randomly manipulated a small portion of their chosen content, and then measured every participant’s level of post-treatment knowledge. The non-blinded nature of the study was critical - I wanted participants to believe they were receiving the information they chose to receive. If they had been randomly assigned all content, it would have defeated the study’s goal of studying effect heterogeneity. Randomization was indeed secondary to self-selection in this design; it was merely an easy analytic device for understanding causal effects among those with different content preferences. If participants had been blind, the experiment would have been meaningless; the entire point was that they were not blind.</p>
<p>These examples from my own work demonstrate cases where randomly allocated opportunities for self-selection, completely non-blind to participants, proved incredibly useful for understanding the influence of messages on political outcomes. Self-selection - who selects which treatment and with what effect - is actually a valuable research puzzle. Yet studying it by definition requires research designs where participants are not blind to treatment status. The demand for blindness may be useful in some contexts, but I think Deaton and Cartwright - and many experimentalists - overlook the many situations where non-blindness is useful and where experiments involving structured opportunities for self-selection can be particularly enlightening.</p>Thomas J. Leeper