Computational Communication Flashcards
Big Data meaning
condition in which the data are so large, complex, and/or variable that the tools required to understand them must be invented
Three different types of data
- Structured data
- Semi-structured data
- Unstructured data
Structured data
-well-defined, easily organized database information
-can go into spreadsheets for quick analysis
E.g. Yelp star ratings are easy to store (stars are a form of categorization)
Unstructured data
-content, such as social media posts, that does not follow a data model
-difficult to analyze for patterns
e.g. the posts and videos on social media platforms
Semi-structured data
-loosely organized into categories using meta tags
E.g. hashtags that enable quick searches of photo content, which can be used to examine what types of posts lead to more engagement
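A minimal sketch of the hashtag example, assuming posts arrive as free text with a like count attached. The sample posts, the field names, and both helper functions are hypothetical and only illustrate how meta tags make loosely organized content searchable.

```python
import re
from collections import defaultdict

# Hypothetical posts: free text plus a like count (not real platform data).
posts = [
    {"text": "Golden hour at the beach #sunset #travel", "likes": 120},
    {"text": "New recipe on the blog #food", "likes": 45},
    {"text": "Packing for the next trip #travel", "likes": 80},
]


def extract_hashtags(text):
    """Return the hashtags in a post, lowercased and without the '#'."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


def average_likes_by_hashtag(posts):
    """Group posts by hashtag and average the likes within each group."""
    likes = defaultdict(list)
    for post in posts:
        for tag in extract_hashtags(post["text"]):
            likes[tag].append(post["likes"])
    return {tag: sum(values) / len(values) for tag, values in likes.items()}


engagement = average_likes_by_hashtag(posts)
for tag, avg in sorted(engagement.items()):
    print(f"#{tag}: {avg:.1f} average likes")
```

The hashtags act as the "meta tags": the post text itself stays unstructured, but the tags give just enough structure to group posts and compare engagement.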
What type of data has proliferated?
unstructured data
previously, data was mostly structured
What is big data capable of?
big data sets have the potential to reveal patterns of individual and group behavior
e.g. FB can determine if you’ll break up with your partner before you do and will advertise products associated with this (e.g. ice cream)
How has big data “broken” American politics?
-overemphasis on big data has undercut meaningful political communication
-analysts no longer focus on centrism
-political campaigns target the beliefs of individual voters and maximize anger to get people to vote, which has worsened the partisan divide
Sources of big data
-digital life data
-digital trace data
-digitalized life data
digital life data
capturing of digitally mediated social behaviors
digital trace data
the archival exhaust of modern bureaucratic organization
e.g. call logs, web page browsing data, social media likes and posts, location data, transactional data
digitalized life data
the movement of intrinsically analog behavior into digital form
e.g. what features of wedding dances characterize a successful or unsuccessful marriage
*Two ways digital platform data may be viewed
- Generalizable microcosms of society (e.g. email and social networks)
- Distinctive realms where human experience now resides (e.g. does FB accentuate or create an information filter?)
*How is data obtained for research paradigms?
-observational approaches (digital trace data)
-theoretical approaches (computer simulations)
-experimental research (virtual labs and field experiments)
-computational communication (big data analytics and machine learning, such as AI)
Obtaining data has become more difficult because platforms that once offered open access (e.g. Twitter) no longer do
What is computational communication?
applying digital tools and automated methods to study human communication
Example of computational communication
Conducting social media sentiment analysis using AI to track public opinion during elections
e.g. Kamala Harris was told to stay away from discussing trans rights because sentiment analysis revealed conservatives would twist the issue
Increasing availability of digital data
-shift from traditional media (e.g. print) to digital makes communication data easier to track
-Once difficult to track, media consumption can now be observed in online spaces
E.g. Nielsen box vs. online tracking (House of Cards is an analytics-driven show)
Rise of user-generated data
Unlike past media, whose content was dominated by institutional producers, today’s media landscape is driven by individuals creating content on social media and digital platforms
Requires new methods to analyze vast and diverse datasets
E.g. changes in beauty practices, such as a switch from glam to natural looks; a company can use these analytics to change its product line
Emergence of new digital media analytics
-rise of AI-powered tools like chatbots, voice assistants, recommender systems
-requires innovative research designs to study personalization, automation, and algorithmic influence
e.g. chatbots that analyze human behavior to calibrate the conversation, for example when a customer is angry
e.g. personalization of emails to drive sales (such as “Hi Audrey”)
Advancements in Computational Power and Accessibility
Improved data STORAGE, computing POWER, and OPEN SOURCE programming frameworks have made computational research more feasible
Data collection techniques
APIs, web scraping, and tracking and data donation
Data collection: Application Programming Interfaces (APIs)
interfaces that let programs request data from a platform in a structured way; offered across multiple platforms
e.g. Twitter API collecting tweets on a specific hashtag
Data collection: Web scraping
automated scripts extracting publicly available content
E.g. scraping news websites to study political coverage trends and how a political issue is being covered by different news sites
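A minimal sketch of the scraping idea, assuming the page has already been downloaded and that headlines sit in <h2 class="headline"> elements. The sample page, the class name, and the helper functions are hypothetical; a real site needs its own selectors, and its terms of use should be checked first.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded news page.
SAMPLE_PAGE = """
<html><body>
  <h2 class="headline">Senate debates election bill</h2>
  <p>Story text goes here.</p>
  <h2 class="headline">Local team wins final</h2>
  <h2 class="ad">Sponsored content</h2>
</body></html>
"""


class HeadlineParser(HTMLParser):
    """Collect the text of every <h2 class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.in_headline = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2" and dict(attrs).get("class") == "headline":
            self.in_headline = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_headline = False

    def handle_data(self, data):
        # Only keep text that appears inside a headline element.
        if self.in_headline:
            self.headlines[-1] += data


def scrape_headlines(html):
    """Return the cleaned headline strings found in an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    return [headline.strip() for headline in parser.headlines]


def count_mentions(headlines, keyword):
    """Count the headlines that mention a keyword (case-insensitive)."""
    return sum(keyword.lower() in headline.lower() for headline in headlines)


headlines = scrape_headlines(SAMPLE_PAGE)
election_count = count_mentions(headlines, "election")
print(headlines, election_count)
```

Running the same functions over pages from several outlets would give per-site counts that can be compared to see how heavily each site covers an issue.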
Data collection: Tracking and Data Donation
voluntary user contributions for research
e.g. Participants donating browsing history for misinformation studies
Research Design Innovations
Personalized content analysis: studies of how recommender systems influence users
E.g. Spotify recommending songs
Experimental designs with AI: testing user interactions with AI-generated content
Social media interaction studies: examining how digital engagement influences attitudes
E.g. how emotionally charged language influences thoughts on a given issue
Analytical approaches
Supervised machine learning: algorithms using labeled data to train models to predict specific outcomes (e.g. Teaching an AI to detect fake news based on human-labeled examples)
Unsupervised learning (topic modeling): algorithms identify themes without predefined labels (e.g. AI detecting dominant themes in political discourse from thousands of news articles)
Deep learning and neural networks: advanced techniques for processing text, images, and video (e.g. Detecting bias in media coverage using AI-driven semantic analysis)
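A minimal sketch of the supervised learning card, assuming a handful of human-labeled headlines. It uses a small Naive Bayes text classifier as one possible technique; the training examples, labels, and function names are hypothetical, and the data set is far too small for real use.

```python
import math
from collections import Counter, defaultdict

# Hypothetical human-labeled examples (illustration only).
TRAINING_DATA = [
    ("shocking miracle cure doctors hate", "fake"),
    ("you won't believe this secret trick", "fake"),
    ("celebrity secret shocking truth revealed", "fake"),
    ("city council approves new budget", "real"),
    ("study finds modest rise in employment", "real"),
    ("senate committee schedules budget hearing", "real"),
]


def train(examples):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def predict(text, model):
    """Return the label with the highest Naive Bayes log score."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior: how common the label is in training.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING_DATA)
print(predict("shocking secret cure", model))
print(predict("council budget hearing", model))
```

The labels supplied by humans are what make this supervised: the model only learns which words go with "fake" or "real" because each training example says which one it is.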
Future Research directions
Expanding computational research beyond English and text-based media (research currently pulls mostly from English-language sources, so there’s a heavy skew)
Understanding the societal impact of algorithms (recommender systems, content filters, and AI-generated content shape public discourse, raising concerns about bias, misinformation, and algorithmic transparency)
Addressing bias in AI models
Studying the authenticity crisis in communication
Sampling and statistical inference: sampling bias
Sampling bias occurs when a sample is skewed toward a certain perspective and so does not represent the population
e.g. a sample drawn from X heavily represents conservative and male-oriented perspectives
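A minimal sketch of why sampling bias matters, assuming a made-up population split evenly between two groups with different opinions. All counts, group names, and the support_rate helper are hypothetical.

```python
# Hypothetical population of 1,000 people: (group, supports_policy).
population = (
    [("men", True)] * 350 + [("men", False)] * 150
    + [("women", True)] * 150 + [("women", False)] * 350
)

# Hypothetical platform sample of 100 users in which men make up 90%.
biased_sample = (
    [("men", True)] * 63 + [("men", False)] * 27
    + [("women", True)] * 3 + [("women", False)] * 7
)


def support_rate(records):
    """Share of records whose second field (supports_policy) is True."""
    return sum(supports for _, supports in records) / len(records)


true_rate = support_rate(population)
sample_rate = support_rate(biased_sample)
print(f"population support: {true_rate:.2f}")
print(f"biased sample support: {sample_rate:.2f}")
```

Within each group the sample matches the population, yet the overall estimate is too high simply because one group is overrepresented.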
Research example: sentiment analysis of public discourse
-Taylor Swift recently announced she’s taking some time off
-data collection: collect a dataset of social media posts, news articles, and blogs that mention Taylor Swift
-sentiment analysis: use NLP techniques to categorize commentary as positive, negative, or neutral
-network analysis: map the spread of information to determine key influencers and nodes that contribute to the narrative that she’s in the spotlight too much
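The sentiment and network steps above can be sketched as follows, assuming a tiny word-list classifier and a reshare count as a stand-in for real NLP and network tools. The word lists, sample posts, account names, and field names are all hypothetical.

```python
from collections import Counter

# Hypothetical sentiment word lists (a real study would use an NLP model).
POSITIVE = {"love", "great", "deserves", "amazing"}
NEGATIVE = {"overexposed", "tired", "annoying", "boring"}

# Hypothetical collected posts; "reshared_from" records who was amplified.
posts = [
    {"author": "fan_account", "text": "she deserves a break love her",
     "reshared_from": None},
    {"author": "gossip_blog", "text": "honestly she is overexposed and annoying",
     "reshared_from": None},
    {"author": "user_a", "text": "so overexposed",
     "reshared_from": "gossip_blog"},
    {"author": "user_b", "text": "tired of the coverage",
     "reshared_from": "gossip_blog"},
    {"author": "user_c", "text": "she announced a break",
     "reshared_from": "fan_account"},
]


def classify_sentiment(text):
    """Label text by comparing counts of positive and negative words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Sentiment step: tally the label assigned to each post.
sentiment_counts = Counter(classify_sentiment(p["text"]) for p in posts)

# Network step: accounts that are reshared most often are candidate influencers.
reshare_counts = Counter(p["reshared_from"] for p in posts if p["reshared_from"])
top_source, top_reshares = reshare_counts.most_common(1)[0]

print(dict(sentiment_counts))
print(top_source, top_reshares)
```

Counting reshares is only the simplest possible measure of influence; it is used here to show the idea of finding the nodes that spread a narrative.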