Excel allows you to use the random function to pull in random numbers distributed equally. For example, if you have Excel choose random numbers between 1 and 10 =RANDBETWEEN(1,10) then you will get an even distribution of results when done enough times. If you do the random search 10,000 times, then 1 should come up roughly 10 percent of the time. Same with 2 and with 3 and so on.
But what if we want to use Excel to choose something for us that is not evenly distributed?
Let’s look at this example here. Say we have a list of the top 5000 cities in the United Stated along with population. I am working on a template that will produce sample data. One of the fields is to pick a city. However, we do not want it evenly distributed where any of those 5000 cities can get picked evenly.
Here is some of the data:
We want to set this random generator to make it way more likely to randomly choose the most populated cities like New York city and Los Angeles, over a city with low population like Bunkie. This way the results will look more accurate when you see a large amount of sample data.
We will do this with a vlookup function and a random function.
What I will do is insert a column to add a running total of population.
In cell A1, I just copied over the largest population. But then starting in A2, I am using a formula. That is A1, which is the previous population and adding in the population from the current city, B2.
So now when I copy cell A2 down, you will see how the numbers incrementally rise.
When I copy it all the way down to the last city in the list, we get a total population of 175,172,773.
Now, I will create the vlookup for a non exact match to randomly pull in a city. I will put the formula in column F
Let’s break down this function:
Vlookup will look for a match in the table and pull in a city. So the first part is get the number we want to pull. So we use RANDBETWEEN to pull a random number between 1 and the total population of about 175 million.
Next, the Vlookup will look in the table A:C. I will look in column A for the random number it chose.
Since the last part of this function is TRUE, that means if it doesn’t find an exact match, it will pull in the next closest match below. You just have to make sure the numbers in the table array are increasing.
So once it finds the closest match, it will go to column 3, which is C and pull in the city.
Now, I can now copy this formula down and you will get many random cities.
The reason the results are weighted is because of the numbers you assign on the left of the table. For example, the most populated city has a population of 8.3 million. The second most populated city, Los Angeles has a population of 3.8 million. So New York city has more than twice the amount of chances of showing up than Los Angeles.
So when you want to get a specific weighted results from a random search, use the VLOOKUP function for non exact match. It will work like a charm!
To download the spreadsheet we used, download it here