I90 Final Project

Code due Tues Nov 20, Presentations on Tues, Nov 27

Social Graph Simulation and Analysis

In this project, you will simulate the diffusion of a new technology through a population of users. Each user has connections to some number of other users, forming a social graph. At the beginning, the new technology is used by a small number of users called first adopters. Over time, the technology spreads along the graph connections until all users have adopted it. At this point, we say that the graph is fully saturated.

Your programming objectives are as follows:

1. Graph Construction. You will need to write two classes, a Graph class, representing the entire population of users, and a User class, representing a single person in the graph. These classes are outlined with a number of function headers in the final project framework code. You should fill in these functions and add more attributes as necessary to complete the simulation.

The graph constructor takes a population as a parameter and creates new set of users, numbered 0, 1,..., population - 1. The number of a user should be returned by its get_id() method. When a new graph is constructed, there should be no connections between users and no users should have the new technology.

You should write two methods to add connections to your new graph. The circle_connect() method should connect each user i to user i + 1, and user 0 to the last user, user population - 1. This is an important method because it guarrantees that the graph is connected, and the new technology can eventually reach every user. On the other hand, the random_connections() method should add the given number of random connections between users. Make sure that you model each connection as symmetric. If user a is a friend of user b, user b is also a friend of user a.

2. Time Simulation. You will run what is known as a discrete time simulation on your graph. This means that the model has rules that govern how the state changes from one time period to the next. Before the simulation begins, select a small number of agents to be first adopters of the technology. In each time period, each user with the new technology has a probability of passing it on to each user he or she has connections to (each connection should be calculated independently). This probability should be passed into your graph's time_step method as parameter prob. If the agent with the technology passes it to a neighbor, that neighbor will have the new technology in the next time period, and every time period from then on.

Warning: A common mistake in this type of simulation is in letting the technology pass through multiple connections in one time period. Suppose user 0 has the new technology at time 0. User 0 is connected to user 1, and user 1 is connected to user 2. If your program processes user 0 first, she might pass the technology to user 1. If your program then looks at user 1 and sees that he has the new technology, it might allow user 1 to immediately pass the new technology to user 2, so all three users might have the technology at time 1. That should not be allowed to happen, because the new technology should only move through one connection in each time period. To prevent this situation, you must carefully separate one period from the next. User 0 has the new technology at time 0, so if she gives it to user 1, user 1 will have the technology at time 1, and user 2 can only get the new technology at time 2.

To assist with grading, please use the function headers given in the framework code.

3. Graph Analysis. In the final part of the project, you will play the role of a company that has created a new technology and hopes for it to spread to all the users in the graph quickly. You will write a new class, GraphAnalyzer, which contains code to examine a graph. The GraphAnalyzer constructor should take a Graph object as its one parameter. The most important method in GraphAnalyzer is choose_users, which returns a list of n users to be first adopters of the new technology. You may imagine that the company targets these users with special promotions to ensure that they have the new technology at time 0. The idea behind this method is to choose users so that the technology reaches the entire graph in the shortest possible time

To write your choose_users method, you will have to decide on a strategy for selecting users. You may consider, for example, the ids of the users, how many connections they have, the degrees of separation between users, and so on. Your method may be as simple or as complex as you like, but we will see which team has the best strategy with a final class competition.

For the class competition, we will create a number of graphs and use your GraphAnalyzer to choose the first adopters. We will do this a large number of times and record the average time it takes for your technology to saturate the entire graph. The team with the shortest average time will be winner.

Note that we will be using our own Graph and User classes to generate graphs for this competition. This ensures that all teams will compete with the same graphs. You must write your GraphAnalyzer so that it only calls the Graph and User methods provided in the framework code. Do not add another method to Graph or User and then call this method from within GraphAnalyzer, since the method may not exist in the competition graph objects.

For this competition, we will use graphs of 500 users. We will circle connect each graph, then add 1000 random connections. Your GraphAnalyzer will be used to select 4 users, and we will run the simulation with prob = 0.25.

Deliverables

1. You should prepare a 5 minute presentation for the class. You may demonstrate a run of your simulation, discuss your strategy for choose_users, or present any other parts of your code that are interesting.

2. Please place your script in a single file, final_team_i.py, where i is your team number, and email it to kay at ischool.