IDeaS 2019 Conference: Perspectives (I)

By Stephanie Habersang


Session 1: Interpretive data science: Rendering meaning ‘in the wild’

Vern Glaser kicked off the first session of this fantastic workshop. The aim of this session was to give the audience an idea of what type of research we can do with topic modeling. Laura Nelson, Tim Hannigan, and Mark Kennedy reflected on different empirical examples in which they used topic modeling to build theory in social science research.

The first empirical example was provided by Laura Nelson. She presented a compelling example of how topic modeling helped her to identify new and overlooked tactics in environmental social movements, and how these tactics were used by the movements to achieve change at different levels of society. Her research challenged the current assumption that political ideology is a key dimension for distinguishing different environmental social movements. Rather, she postulates that it is a movement's goal orientation, that is, the level of society at which the movement claims responsibility for initiating environmental change (e.g. individual, collective, or institutional). This example was extremely interesting as it showed the potential of topic modeling to challenge current assumptions and build robust new theory.

Another interesting application was presented by Tim Hannigan. In his research he rendered the kernel of a corporate scandal in the British parliament by studying the micro-processes of stigma. He showed that the extent of the scandal and MPs' resignations did not depend on the severity of the scandal itself but rather on its laughability in the first seven days after disclosure. What I found particularly interesting is that this research was not initially conceived as a project to understand the micro-processes of stigma. Instead, it was the iterative back and forth between research question, data analysis, and theory building that finally led to the framing of the paper. Hence, this example nicely illustrated how topic modeling enables abductive reasoning and resonates with a qualitative, interpretive approach to theory building.

Last but not least, Mark Kennedy provided an insightful reflection on both presentations. He advocated "teaming up" and building stronger bridges between qualitative and quantitative research. Teaming up and being a community means that we can manage both the risks and the opportunities of big data in a more reflective and fruitful way.

From my perspective, the three panelists not only provided fascinating insights into the type of issues interpretive data science can address, but also discussed some fundamental implications for how to use topic modeling. First, Laura Nelson emphasized the importance of context. Understanding the context in which the data is embedded is essential for interpreting the results. She made it very clear that interpretive data science does not seek to identify universal patterns or physical laws that can be applied to all contexts. Rather, interpretation is rooted in a qualitative understanding of the data. This understanding must be used to give voice to marginalized topics and issues in the data and to show diversity rather than uniformity. In line with this argument, Laura Nelson skillfully summarized what interpretive data science should be all about: (1) meaning-making, not universal patterns; (2) understanding, not social laws; and (3) contextual, not universal, understanding. Second, another very important implication came from Tim Hannigan's presentation. Often we can best grasp the meaning of a large data set through one powerful exemplary story or case; in his research it was the illustrative case of one MP. However, a compelling single story must be supported by strong visualizations and representations. This involves not only creativity but also exploration and computational skills. Finally, Mark Kennedy emphasized that the differences between qualitative and quantitative researchers are less profound than we often think. By combining our complementary skills and using the community to enhance our toolkit, we might be able to better explain, understand, and predict social phenomena. Overall, this introductory panel was the perfect kick-off for an inspiring workshop that fostered an inclusive climate for developing a multi-disciplinary community.
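For readers who have not yet worked with topic modeling, a minimal sketch may help make the session's discussion concrete. The example below is purely hypothetical, using Python and scikit-learn with an invented toy corpus (it is not the panelists' actual data or code): it fits a small LDA model and prints the top words of each topic, which are the raw material for the contextual, qualitative interpretation Laura Nelson called for.

```python
# Minimal, hypothetical sketch of an LDA topic-modeling workflow.
# The corpus and all parameter values are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "individual consumers should recycle and reduce household waste",
    "regulators must pass stricter emission rules for industry",
    "our neighbourhood garden project brings residents together",
    "households can cut waste by composting and recycling",
    "activists lobby parliament for binding emission regulation",
    "community workshops teach residents practical gardening skills",
]

# Build a document-term matrix, dropping common English stop words.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(documents)

# Fit LDA; the number of topics is an interpretive choice, not a given.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(dtm)

# Print the top words per topic: the raw material for interpretation.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"Topic {k}: {', '.join(top)}")
```

What such a sketch deliberately leaves out is the hard part: labeling the topics and reading them against their social context, which in Nelson's case is what turned word lists into a finding about individual, collective, and institutional levels of change.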

Session 2: Publishing papers with computational methods

Publishing with computational methods was the subject of the second session. The panel speakers Mark Kennedy, Richard Haans, Hovig Tachalia, and Muhammad Abdul-Mageed had a vibrant discussion about the possibilities and pitfalls of publishing interpretive data science. The panel began by discussing the "coolest thing" they had recently seen on topic modeling. The panelists shared examples from the material sciences, from discourse studies on Brexit, from the field of deep learning, and from management research, where topic modeling is increasingly used as a first step to create an abductive leap in grounded theory methodology. The panel then discussed how topic modeling may help us in doing research. Computational methods are definitely helpful in enhancing human coding procedures, identifying general patterns (that we might not see in smaller datasets), and challenging existing frames. Similarly, computational methods can help us reduce type II errors, that is, decrease the likelihood that we miss interesting findings.

However, the panelists also acknowledged the challenges that come with using a new method and communicating it to the general reader. There are a couple of strategies that the panel highlighted as helpful for convincing editors and reviewers to publish a paper that builds on a new method: (1) using computational methods to validate previous findings (theory testing and validation); (2) showing that results persist even if models change (e.g. as an additional robustness check for new theory building; a minimal sketch of this idea follows below); (3) using online appendices to explain complex methodological issues and keeping the actual method section simple; (4) publishing a methodological paper beforehand; (5) optimizing and actively managing the reviewer pool to get fair and proficient feedback; and finally (6) presenting the paper draft as often as possible to many different people before submitting (getting ideas out to potential editors and reviewers early on). An important learning from this session was that we should not take institutions (e.g. journal standards) for granted. Although most journals change very slowly and stick to tried-and-true methods, many editors are becoming increasingly open to new methodological ideas and representations. As such, the overall recommendation of the panel was to build a community and to dare to publish interpretive data science in general management journals as well.
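To make strategy (2) concrete: a robustness check of this kind typically re-fits the topic model under alternative specifications and verifies that the substantive finding survives. The sketch below is purely illustrative, assuming Python with scikit-learn; the corpus, the topic counts, and the "key terms" are all invented placeholders, not the panelists' actual procedure.

```python
# Hedged sketch of a model-robustness check: re-fit LDA with different
# numbers of topics and test whether a topic gathering the study's key
# terms is recovered under every specification. Corpus, topic counts,
# and key terms are all hypothetical placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [  # stand-in for the real corpus used in the main analysis
    "mp expenses scandal resignation parliament debate",
    "laughable expenses claim mocked in the press",
    "budget committee debates fiscal policy reform",
    "parliament passes the new fiscal budget",
    "media coverage of the expenses scandal grows daily",
    "mp announces resignation after a week of scandal coverage",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(documents)
terms = vectorizer.get_feature_names_out()

key_terms = {"expenses", "scandal"}  # terms anchoring the hypothetical finding

for k in (2, 3, 4):  # alternative model specifications
    lda = LatentDirichletAllocation(n_components=k, random_state=0).fit(dtm)
    # Check whether any topic still collects the key terms among its top words.
    recovered = any(
        key_terms <= {terms[i] for i in topic.argsort()[::-1][:10]}
        for topic in lda.components_
    )
    print(f"k={k} topics: key topic recovered -> {recovered}")
```

If the key topic is recovered across specifications, that persistence is exactly the kind of evidence the panel suggested offering reviewers, ideally relegated to an online appendix as per strategy (3).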

Session 3: Meaning and interpretation: Is big data any different than small data?

In the afternoon, Joseph Porac drew on his extensive expertise in socio-cognitive dynamics as well as computerized text analysis and gave a highly interesting talk titled "Meaning and Interpretation: Is Big (Text) Data Any Different than Small Data?" In his talk he presented some very convincing examples of what we can learn from the interpretation of small data to better interpret and translate big data. Drawing on two extreme concepts of translation (Derrida's non-referential deconstruction, in which translating any text into stable meaning is almost impossible, versus Google Translate, in which universal translations are easily generated in a "quick-and-dirty" or "good-enough" fashion), Joseph Porac exemplified the difficulties we face when we talk about meaning-making and translation. He pointed out that basically everything is always about meaning-making, e.g. how we interpret the results of an experiment, how we interpret a stylized questionnaire to develop meaningful questions for a certain research setting, and how we attach (or fail to attach) meaning to things that we have not experienced ourselves.

He then drew on Peter Winch's 1958 book "The Idea of a Social Science and Its Relation to Philosophy" and introduced three theses to illustrate that the issues of meaning and interpretation are deeply embedded in the idea of social science, independent of big or small data. The first thesis he discussed was "the rule thesis". This thesis postulates that understanding human language use involves seeing the rules or properties in accordance with which it is produced, not just regularities in its production. If we wish to understand how a person represents things, in particular what they say about things, then we need to know the rules that govern their thoughts and words: what would make it right for them to say what they say, and what would make it wrong. In this sense, the context from which these rules emerge is fundamental for understanding, and this line of thinking applies equally when we want to interpret big data. Second, "the practicability thesis" states that understanding human language use does not mean just grasping the intellectual ideas that permeate it but, more deeply, cottoning on to the practical orientations of the actors. While human language and action essentially involve rule-following, the rules in question cannot all be grasped in an intellectual manner. The main take-away here is that rule-following ultimately rests on a foundation of practice. Finally, the third thesis is "the participation thesis". It states that understanding human language use involves participating in the society of the agents, at least in imagination, not just standing back and surveying what they are doing. This is closely related to the point that interpretation involves empathy for those we research. As such, whether we interpret small or big data, we as researchers cannot and should not act as detached observers.

The important take-away from this talk was that the above-mentioned issues of meaning and interpretation will not "go away" with the rise of big data. On the contrary, big data may exacerbate them. And although big data is becoming increasingly important in social science, small data, and the insights we can draw from it, will not be going away anytime soon. Hence, Joseph Porac closed his talk by emphasizing the necessity of an interpretive data science.


Session 5: What happens when we govern with numbers?

While the previous day was all about interpretive data science, the second day of the workshop focused more broadly on the politics of big data. The morning session started with the amazing Wendy Espeland and her talk "What Happens When We Govern with Numbers?", which asked how people do things with numbers and what the sociological consequences are. While Wendy Espeland acknowledged the positive side of quantification, for example that it can make previously invisible groups visible (e.g. through large-scale studies of LGBT movements), she also, more critically, emphasized the performative aspect of numbers. Instead of attributing essential value to governing with numbers, we must understand its implications for power. Wendy Espeland used three powerful examples to critically reflect on governance with numbers: sentencing guidelines, university rankings, and cost-benefit analysis. For example, university rankings were initially developed to provide information and transparency, helping people make better decisions about which school to attend. Over time, however, university rankings have morphed into something new: a regime of surveillance. This regime locks universities into a competitive system that forces deans to care more about a "winning season" than about working toward long-term impact. While this was an unintentional shift in governance with numbers, the consequences are very real and irreversible. Hence, quantification can reorganize power structures, just as the emergence of college rankings chipped away at the power of university deans.

The main take-away from this speech was that numbers do things: they organize social life. As such, we should be worried about and sensitive to the unintended consequences of governance by numbers. Once we quantify something and it gets out, we cannot control what other people do with it. Thus, Wendy Espeland challenged us to consider five rules when we study governance by numbers: (1) follow the number over time; (2) ask what happens to power and accountability; (3) ask what happens to status; (4) ask what happens to visibility; and finally (5) ask how the number can be challenged. The powerful message that remains from this talk is that ethics, morality, and politics are fundamentally intertwined with numbers. All of us who work with and interpret numbers have a responsibility to constantly ask: "What is there that we don't see? How can we make the invisible visible? Who benefits and why?"


Session 6: The politics of big data

In the panel session "The politics of big data", Chris Steele discussed truth(s) and fact(s) in the context of big data with Wendy Espeland, David Krisch, Dev Jennings, and Joel Gehman. Chris Steele opened with an interesting introduction to the question "How are facts made?", using practice-driven institutionalism to introduce an ecology of fact-making. For the study of data per se, he concluded that revealing the political ecology of facticity within which data is made, and within which its consequences arise, is fundamentally important for our field.

The main take-away from this panel discussion was that big data can easily lead us to overestimate what we can know, simply because of the sheer amount of data that is available. However, all data (big and small) reveal some things and exclude others. Therefore, the panel called for a strong emphasis on reflexivity in collecting, analyzing, and interpreting big data. It is essential that we critically examine what we see and what we do not see or exclude in the data. It is our responsibility as scholars to understand the taken-for-granted assumptions that underpin our data (collection) and shape our interpretation. One way to reflect upon our own taken-for-granted assumptions is to constantly ask: what must be true for the data that we collect? It is important to think about the potential biases we build into the data, not only during data collection but also by choosing inappropriate methods.

Session 7: HIBAR Research – Exploring how research can make a difference in the real world

Marc-David Seidel introduced the HIBAR approach in the very last session. HIBAR stands for highly integrative basic and responsive research. This approach seeks to combine a desire for discovery with a desire to solve major problems (often related to grand challenges) through collaboration between academic and non-academic experts. The HIBAR approach highlights collaboration that is interdisciplinary and transdisciplinary and supports diverse expert teams in making a real difference in the world. This keynote provided an inspiring close, as Marc-David Seidel reminded us to think about solving important problems and to dare to cross disciplinary boundaries. In this regard, interpretive data science might offer a promising opportunity to bring together scholars from various fields (e.g. management, information systems, sociology, sustainability) with different abilities (e.g. qualitative or quantitative methods) to tackle important real-world problems.
