Framework

RQ1: How to Simulate Large-scale Social Movement Participants?
User engagement in social networks often follows a Pareto distribution: the bulk of content originates from a small fraction of individuals. The more active and influential users, such as opinion leaders, should therefore be modeled in fine detail, while the silent majority can be driven by simpler models. As illustrated in the figure, social media users are divided into core users and ordinary users. The two types are driven by different models, which addresses the cost and efficiency issues of running thousands of LLMs.
- Core users are empowered by large language models.
- Ordinary users are driven by conventional agent-based models, such as the Bounded Confidence Model.
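As a rough illustration of how ordinary users could be driven, the sketch below implements one common bounded confidence variant, a Deffuant-style pairwise update. The source names only the Bounded Confidence Model; the choice of variant, the function name `bounded_confidence_step`, and the parameters `epsilon` (confidence threshold) and `mu` (convergence rate) are assumptions made for this example.

```python
import random

def bounded_confidence_step(opinions, epsilon=0.3, mu=0.5, rng=random):
    """One pairwise update of a Deffuant-style bounded confidence model.

    Two randomly chosen users move toward each other only if their
    opinions differ by less than `epsilon`. All names and default
    values here are illustrative assumptions.
    """
    i, j = rng.sample(range(len(opinions)), 2)
    diff = opinions[j] - opinions[i]
    if abs(diff) < epsilon:
        opinions[i] += mu * diff
        opinions[j] -= mu * diff
    return opinions

# Toy run: 200 ordinary users with attitudes drawn uniformly from [-1, 1].
rng = random.Random(0)
opinions = [rng.uniform(-1.0, 1.0) for _ in range(200)]
initial_mean = sum(opinions) / len(opinions)
for _ in range(20000):
    bounded_confidence_step(opinions, rng=rng)
final_mean = sum(opinions) / len(opinions)
```

Because each update is symmetric, the average attitude of a closed population stays constant in this variant; any shift in the average would have to come from outside influence, such as messages from core users.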
RQ2: How to Accurately Simulate Core Users and Replicate Their Behaviors within the Community?
We build an agent architecture that gives LLMs the capabilities needed for core user simulation. An overview of the architecture is shown in the left part of Figure 3. Each agent is equipped with a profile module, a memory module, and an action module.
- Profile Module: each agent's profile contains Demographics, Social Traits and Communication Roles initialized from real data.
- Memory Module: manages each agent's memories through three operations: memory writing, memory retrieval, and memory reflection.
- Action Module: We consider actions that are highly related to information and attitude propagation, including: post, retweet, reply, like, and do nothing.
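The three modules could be organized along the following lines. This is a minimal sketch only: the class names, the word-overlap retrieval score, and the caller-supplied `summarize` function (which would typically wrap an LLM call) are assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field
from enum import Enum

class Action(Enum):
    """The five actions listed for the action module."""
    POST = "post"
    RETWEET = "retweet"
    REPLY = "reply"
    LIKE = "like"
    DO_NOTHING = "do_nothing"

@dataclass
class Profile:
    demographics: str
    social_traits: str
    communication_role: str

@dataclass
class Memory:
    entries: list = field(default_factory=list)  # (timestamp, text) pairs

    def write(self, text, timestamp):
        self.entries.append((timestamp, text))

    def retrieve(self, query, k=3):
        # Hypothetical scoring: word overlap with the query, ties broken by recency.
        q = set(query.lower().split())
        ranked = sorted(
            self.entries,
            key=lambda e: (len(q & set(e[1].lower().split())), e[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]

    def reflect(self, summarize):
        # `summarize` is supplied by the caller; its output is written back as a memory.
        if not self.entries:
            return None
        latest = max(t for t, _ in self.entries)
        summary = summarize([text for _, text in self.entries])
        self.write(summary, latest)
        return summary

@dataclass
class CoreUserAgent:
    profile: Profile
    memory: Memory = field(default_factory=Memory)

memory = Memory()
memory.write("Saw a tweet about the Golden Globes protest", 1)
memory.write("Replied to a thread on tax policy", 2)
memory.write("Liked a tweet about the MeToo movement", 3)
top = memory.retrieve("MeToo movement", k=1)
reflection = memory.reflect(lambda texts: f"{len(texts)} memories so far")
```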
RQ3: How to Evaluate the Effectiveness of the Simulation?
To comprehensively evaluate the effectiveness of simulation, we consider both micro-level evaluation and macro-level evaluation, focusing on individual user alignment and systemic outcomes respectively.
- Micro Alignment Evaluation: run single-round simulations in which each core user agent is given authentic contextual information, then assess its decisions in terms of stance, content, and behavior.
- Macro System Evaluation: quantify the attitude distribution in a complete multi-round simulation, considering both static attitude distribution and time series of the average attitude.
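To make the macro-level comparison concrete, the sketch below compares a simulated and an observed attitude distribution by the gap in their mean and spread, and compares two average-attitude time series by Pearson correlation. These particular measures and the function names `static_gap` and `pearson` are assumptions chosen for illustration; the evaluation described above may use different statistics.

```python
from statistics import mean, pstdev

def static_gap(simulated, observed):
    """Gap in mean and in spread between two attitude distributions."""
    return {
        "mean_gap": abs(mean(simulated) - mean(observed)),
        "spread_gap": abs(pstdev(simulated) - pstdev(observed)),
    }

def pearson(xs, ys):
    """Pearson correlation between two equally long time series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Toy data (made up for the example, not results from the simulation).
gap = static_gap([0.2, 0.4], [0.1, 0.5])
trend = pearson([0.0, 0.1, 0.2, 0.3], [0.1, 0.2, 0.3, 0.4])
```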
Examples of Generated Content of Core User Agents
To simulate core users, we provide the LLM-empowered agents with the following information, then ask them to give a thought before acting and decide which action to take: (1) the profile or description of the agent itself; (2) the memory of the agent; (3) triggering offline news, if any event has happened; (4) the tweet page showing tweets visible to the agent; (5) notifications containing replies to the agent. Here are some examples.
*User information has been anonymized.
PROMPT:
You are using the social media Twitter. You might need to perform reaction to the observation. You need to answer what you will do to the observations based on the following information:
(1) You are J***n. J***n is a highly active and moderately influential activist on social media. He is a liberal progressive and anti-Republican. He frequently shares and discusses ideas and opinions, often challenging or dismissing the ideas of others. He is passionate about topics he strongly believes in and wants to share conversations for the benefit of others.
(2) Current time is 2018-01-06 20:00:00
(3) The news you got is "1. A month ago, President Donald Trump enthusiastically endorsed Roy Moore, the Alabama Senate candidate accused of sexual misconduct, on Twitter. 2. At the Golden Globes Awards ceremony in Los Angeles, most guests showed up dressed in black out of solidarity with the MeToo and Time's Up movement and the victims of sexual violence."
(4) Your history memory is
(5) The twitter page you can see is
(6) The notifications you can see are
In terms of how you actually perform the action, you take action by calling functions. Currently, there are the following functions that can be called.
- do_nothing(): Do nothing. There is nothing that you like to respond to.
- post(content): Post a tweet. `content` is the sentence that you will post.
- retweet(content, author, original_tweet_id, original_tweet). Retweet or quote an existing tweet in your twitter page. `content` is the statements that you attach when retweeting. If you want to say nothing, set `content` to None. `author` is the author of the tweet that you want to retweet, it should be the concrete name. `original_tweet_id` and `original_tweet` are the id and content of the retweeted tweet.
- reply(content, author, original_tweet_id). Reply to an existing tweet in your twitter page or reply one of replies in your notifications, but don't reply to yourself and those not in your tweet page. `content` is what you will reply to the original tweet or other comments. `author` is the author of the original tweet or comment that you want to reply to. `original_tweet_id` is the id of the original tweet.
- like(author, original_tweet_id). Press like on an existing tweet in your twitter page. `author` is the author of the original tweet that you like. `original_tweet_id` is the id of the original tweet.
Call one function at a time, please give a thought before calling these actions, i.e., use the following format strictly:
[OPTION 1]
Thought: None of the observation attract my attention, I need to:
Action: do_nothing()
[OPTION 2]
Thought: due to `xxx`, I need to:
Action: post(content="yyy")
[OPTION 3]
Thought: due to `xxx`, I need to:
Action: retweet(content="yyy", author="zzz", original_tweet_id="0", original_tweet="kkk")
[OPTION 4]
Thought: due to `xxx`, I need to:
Action: reply(content="yyy", author="zzz", original_tweet_id="0")
[OPTION 5]
Thought: due to `xxx`, I need to:
Action: like(author="zzz", original_tweet_id="1")
Now begin your actions. Remember only write one function call after `Action:`.
Based on the above history, what will you, J***n, do next?
RESPONSE:
Thought: The observation about the Golden Globes Awards ceremony and the MeToo movement is significant and aligns with my progressive values.
Action: post(content="The solidarity shown at the Golden Globes Awards ceremony is inspiring. It's time to stand up against sexual violence and support the victims. #MeToo #TimesUp")
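A response in this `Thought:` / `Action:` format can be turned into a structured action without evaluating model output as code. The sketch below is one possible parser, assuming the model follows the keyword-argument style shown in the options; `parse_response` and `ALLOWED_ACTIONS` are hypothetical names introduced for this example.

```python
import ast
import re

ALLOWED_ACTIONS = {"do_nothing", "post", "retweet", "reply", "like"}

def parse_response(text):
    """Return (thought, action_name, kwargs) from a Thought/Action response."""
    thought = re.search(r"Thought:\s*(.*)", text)
    action = re.search(r"Action:\s*(.*)", text)
    if action is None:
        raise ValueError("no Action line found")
    # Parse the call as an expression; nothing is executed.
    call = ast.parse(action.group(1).strip(), mode="eval").body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("Action is not a simple function call")
    if call.func.id not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {call.func.id}")
    if call.args:
        raise ValueError("only keyword arguments are supported in this sketch")
    # literal_eval accepts only literals (strings, numbers, None, ...).
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return (thought.group(1).strip() if thought else ""), call.func.id, kwargs

response = (
    "Thought: The observation about the Golden Globes Awards ceremony and the "
    "MeToo movement is significant and aligns with my progressive values.\n"
    "Action: post(content=\"The solidarity shown at the Golden Globes Awards "
    "ceremony is inspiring. It's time to stand up against sexual violence and "
    "support the victims. #MeToo #TimesUp\")"
)
thought, name, kwargs = parse_response(response)
```

Restricting the parser to a fixed set of action names and literal arguments means malformed or unexpected output raises an error instead of being silently executed.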
Examples of the Overall Simulation Results
We observe the simulated systemic outcomes at the macro level. The scenario is as follows: when a specific event occurs, Twitter users discuss it and exchange opinions. Individual opinions may change as a result, which shifts the average sentiment of the whole system. Here are some examples.

Break the Echo Chambers
We aim to explore strategies to mitigate echo chambers while safeguarding users' freedom of expression. We propose and test three solutions in our simulation framework:
- S1: Feeding the opposite opinions;
- S2: Feeding the neutral opinions;
- S3: Establishing public spaces for debate or discussion, achieved by encouraging users to share opinions using platform-provided public hashtags.
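The two feed-based strategies, S1 and S2, can be pictured as a filter over candidate tweets. The sketch below assumes every tweet carries a stance score in [-1, 1] and every user an attitude on the same scale; the function `select_feed`, the `neutral_band` threshold, and the ranking rules are illustrative assumptions rather than the procedure used in the simulation. S3 relies on public hashtags rather than feed selection and is not covered here.

```python
def select_feed(user_attitude, candidates, strategy, k=2, neutral_band=0.2):
    """Pick up to k tweet ids to show a user under strategy S1 or S2.

    candidates: list of (tweet_id, stance) pairs, stance in [-1, 1].
    """
    if strategy == "S1":
        # Opposite opinions: stance sign differs from the user's attitude,
        # strongest opposing stance first.
        pool = [c for c in candidates if c[1] * user_attitude < 0]
        pool.sort(key=lambda c: abs(c[1]), reverse=True)
    elif strategy == "S2":
        # Neutral opinions: stance close to zero, most neutral first.
        pool = [c for c in candidates if abs(c[1]) <= neutral_band]
        pool.sort(key=lambda c: abs(c[1]))
    else:
        raise ValueError("strategy must be 'S1' or 'S2'")
    return [tweet_id for tweet_id, _ in pool[:k]]

# Toy candidates (made up for the example).
candidates = [("t1", 0.8), ("t2", -0.6), ("t3", 0.05), ("t4", -0.9)]
opposite = select_feed(0.7, candidates, "S1")
neutral = select_feed(0.7, candidates, "S2")
```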


By measuring the homogeneity of content consumption and production and the toxicity of generated content, as shown in Table 1, we find:
- S1 can reduce echo chambers, but introducing opposing opinions can substantially increase the toxicity of the community.
- S3, which establishes spaces for open discussion, promotes more peaceful exchanges and yields the lowest toxicity.
Related Works about Social Media
- Hanjia Lyu, Jinfa Huang, Daoan Zhang, Yongsheng Yu, Xinyi Mou, Jinsheng Pan, Zhengyuan Yang, Zhongyu Wei, Jiebo Luo.
SoMeLVLM: A Large Vision Language Model for Social Media Processing
- Xinnong Zhang*, Haoyu Kuang*, Xinyi Mou, Hanjia Lyu, Kun Wu, Siming Chen, Jiebo Luo, Xuanjing Huang, Zhongyu Wei.
PASUM: A Pre-training Architecture for Social Media User Modeling based on Text Graph
- Kun Wu*, Xinyi Mou*, Lanqing Xue, Zhenzhe Ying, Weiqiang Wang, Qi Zhang, Xuanjing Huang, Zhongyu Wei.
Align Voting Behavior with Public Statements for Legislator Representation Learning
- Xinyi Mou, Zhongyu Wei, Lei Chen, Shangyi Ning, Yancheng He, Changjian Jiang, Xuanjing Huang.