Community Detection Flashcards
Social Media Data Mining
a method of uncovering hidden patterns, trends and other information in data gathered from a range of platforms (e.g. Facebook, Twitter and WhatsApp)
What are 2 social media data mining techniques?
1) Text Mining
2) Graph Mining
Text Mining
the process of extracting information from textual data
Graph Mining
the process of extracting information from data represented as a graph of nodes and edges (focuses on community detection)
Information
relationships, associations and patterns among the data
Data
a collection of facts and statistics gathered for analysis; a computer can store it, retrieve it and perform operations on it
Slide 430
Knowledge
information distilled into an understanding of historical patterns, which can be used to make future predictions
Slide 430
What are the 3 data mining use cases?
1) Trend Analysis
2) Event Detection
3) Social Spam detection
Trend Analysis
Trend analysis is the examination of data over a period to identify patterns, trends, and insights
Slide 431
Event Detection (social heat mapping)
Social media heat mapping involves using visual representations (heatmaps) to understand and analyze user interactions on social media platforms.
Slide 433
Social Spam Detection
platforms mine social media data to gradually improve their detection of bots
Slide 434
What are the characteristics of a social network?
1) They are a collection of entities (like people)
2) Relationships are the medium connecting the entities
3) Locality: if an entity A is related to entities B and C, then B and C are more likely than average to be related to each other
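The third characteristic can be checked on a small graph by asking, for one entity, what fraction of its neighbor pairs are themselves related. The sketch below is only an illustration: the `friends` graph and the function name `local_clustering` are made up for this card, not taken from the slides.

```python
from itertools import combinations


def local_clustering(graph, node):
    """Fraction of pairs of node's neighbors that are themselves connected."""
    pairs = list(combinations(sorted(graph[node]), 2))
    if not pairs:
        return 0.0  # fewer than two neighbors, so there is no pair to test
    linked = sum(1 for b, c in pairs if c in graph[b])
    return linked / len(pairs)


# Hypothetical undirected friendship graph (adjacency sets).
friends = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

# A has neighbor pairs (B, C), (B, D), (C, D); only B and C are related.
print(local_clustering(friends, "A"))
```

A value well above what a random graph with the same number of edges would give suggests the locality described in point 3.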
What are the types of social networks?
1) Telephone Networks (nodes = phone numbers & edge labels = # of calls between them)
2) Email Networks (nodes = email addresses & weak edge = email in 1 direction & strong edge = email in both directions)
3) Collaboration Networks (nodes = individuals who have published papers & edge labels = # of joint papers)
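As a rough illustration of the weak/strong distinction in email networks, the sketch below labels each pair of addresses from a list of (sender, recipient) records. The addresses, the `messages` list and the function name `classify_email_edges` are hypothetical.

```python
def classify_email_edges(messages):
    """Label each pair of addresses 'strong' (mail both ways) or 'weak' (one way)."""
    directed = set(messages)
    labels = {}
    for sender, recipient in directed:
        if sender == recipient:
            continue  # ignore mail sent to oneself
        pair = frozenset((sender, recipient))
        if (recipient, sender) in directed:
            labels[pair] = "strong"
        else:
            labels[pair] = "weak"
    return labels


# Hypothetical message log: (sender, recipient).
messages = [
    ("ann@example.com", "bob@example.com"),
    ("bob@example.com", "ann@example.com"),
    ("ann@example.com", "cat@example.com"),
]

edges = classify_email_edges(messages)
for pair, label in edges.items():
    print(sorted(pair), label)
```

Using a `frozenset` as the key treats the pair as a single undirected edge, so the direction only matters when deciding its label.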
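Explain the Girvan-Newman algorithm
- Edges between nodes on the same BFS level are drawn as dashed lines (they lie on no shortest path from the root)
- Solid edges form a Directed Acyclic Graph (DAG)
- Step 1: Run a breadth-first search (BFS) from a root node
- Step 2: Label the nodes with the number of shortest paths from the root (root = 1, every other node = sum of its parents' labels)
- Step 3: Calculate the credits bottom-up (each leaf node = 1; every other node = 1 + sum of the credits on the DAG edges below it; a node with multiple parents splits its credit among the edges to those parents in proportion to the parents' labels)
- The credit on each edge is that root's contribution to the betweenness of that edge; repeat with every node as root, sum the credits and divide by 2, since each shortest path is counted from both endpoints
- Remove the edge with the highest betweenness score, then repeat until the graph separates into communities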
Slides 445-447
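The steps on the Girvan-Newman card can be sketched in Python using only the standard library. This is a minimal sketch, not the slides' own code: the function names and the seven-node example graph are assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def edge_credits_from_root(graph, root):
    """One pass from a single root: credit each DAG edge."""
    # Step 1: breadth-first search gives each node its level.
    level = {root: 0}
    order = [root]
    queue = deque([root])
    # Step 2: paths[n] = number of shortest paths from root to n.
    paths = defaultdict(int)
    paths[root] = 1
    parents = defaultdict(list)
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in level:
                level[v] = level[u] + 1
                order.append(v)
                queue.append(v)
            if level[v] == level[u] + 1:  # DAG edge; same-level edges are skipped
                paths[v] += paths[u]
                parents[v].append(u)
    # Step 3: work bottom-up; every node starts with credit 1.
    credit = {n: 1.0 for n in order}
    edge_credit = {}
    for v in reversed(order):
        for p in parents[v]:
            # Split v's credit among its parents by their path counts.
            share = credit[v] * paths[p] / paths[v]
            edge_credit[frozenset((p, v))] = share
            credit[p] += share
    return edge_credit


def edge_betweenness(graph):
    """Sum the credits over every root, then halve (each path is counted twice)."""
    total = defaultdict(float)
    for root in graph:
        for edge, share in edge_credits_from_root(graph, root).items():
            total[edge] += share
    return {edge: share / 2 for edge, share in total.items()}


# Hypothetical undirected graph: two tight groups joined by the edge B-D.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B", "E", "F", "G"],
    "E": ["D", "F"],
    "F": ["D", "E", "G"],
    "G": ["D", "F"],
}

betweenness = edge_betweenness(graph)
top_edge = max(betweenness, key=betweenness.get)
print(sorted(top_edge), betweenness[top_edge])  # ['B', 'D'] 12.0
```

Removing the top edge B-D splits this example into the groups {A, B, C} and {D, E, F, G}; in general the betweenness would be recomputed after each removal.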