Subj : implementing zipf distribution?
To : comp.theory,comp.programming
From : Digital Puer
Date : Sat Aug 06 2005 06:46 pm

I am writing a simulation and would like to implement a Zipf distribution. I could use some help.

Suppose there are 20 files duplicated across 100 servers, where each server can hold 5 unique files. I would like to distribute the 20 files among the servers with a Zipf distribution. I think that means that the i-th most popular file appears with probability 1/i^a, where i=1 is the most popular and a is supposed to be near 1 (say, it is 1). Each server needs to select 5 files.

Is the following a good algorithm?

- create a vector
- fill the vector with integers, where each integer is in the range [1,20], representing the 20 files
- integer "1" appears, say, 1000 times
- integer "2" appears 1/2 as many times as the first (500)
- integer "3" appears 1/3 as many times as the first (333)
- and so on, down to integer "20", which appears 1/20 as many times as the first (50)
- each server then randomly selects 5 integers from this vector. No duplicates are allowed for a given server, so it keeps choosing (and putting back) until 5 unique integers are taken

Is the above correct? The problem I see is that the most popular file (file "1") is selected with probability 1000/(1000+500+333+...+50), which doesn't jibe with the 1/i definition. It also appears that there has to be a starting number of appearances for the most popular item (in this case, 1000) as a parameter to the problem.

Thanks for any help.
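
For reference, here is a minimal Python sketch of the setup described in this post, under one assumption that the post does not state: the weights 1/i^a are divided by their sum so that they form a proper probability distribution. With that normalization no starting count (such as 1000) is needed, and the ratio between any two files still follows 1/i^a. The function names (zipf_probabilities, pick_files, place_files) and the seed parameter are hypothetical, chosen only for illustration.

```python
import random


def zipf_probabilities(n_files, a=1.0):
    """Return normalized Zipf probabilities for ranks 1..n_files.

    The raw weight of rank i is 1 / i**a; dividing by the sum of all
    weights makes the values add up to 1.
    """
    weights = [1.0 / (i ** a) for i in range(1, n_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def pick_files(probs, k, rng):
    """Draw k distinct file ranks (1-based), redrawing on duplicates."""
    ranks = list(range(1, len(probs) + 1))
    chosen = set()
    while len(chosen) < k:
        # One weighted draw; a rank already held by this server is ignored.
        chosen.add(rng.choices(ranks, weights=probs, k=1)[0])
    return sorted(chosen)


def place_files(n_servers=100, n_files=20, per_server=5, a=1.0, seed=0):
    """Assign per_server distinct files to each of n_servers servers."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    probs = zipf_probabilities(n_files, a)
    return [pick_files(probs, per_server, rng) for _ in range(n_servers)]


if __name__ == "__main__":
    placement = place_files()
    print(placement[:3])
```

With 20 files and a = 1, the normalized probability of file 1 on a single draw is roughly 0.28, and file 2 is exactly half as likely as file 1. Note that requiring 5 unique files per server flattens the result: the share of servers holding a given file will not follow 1/i exactly, because a popular file cannot be counted more than once per server.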