Initials data, part I
Here’s something I’ve been working on for a while. It’s part of a bigger project, but the individual stages are neat, too.
It’s easy to go to the Social Security Administration website and look up the most popular baby names by year. I grouped together some of these data to figure out the most popular first initials by year, starting with 1986, my birth year. The most popular first initial for girls born in the US in 1986 was “A”, with 15.47% of all girls. Five “A” names were each given to over 0.5% of all girls: Ashley, Amanda, Amber, Amy, and Angela. The most popular first initial for 1986 boys was “J”, accounting for 19.92% of boys. “J” was mostly carried by the strength of Joshua, James, John, Joseph, Justin, Jonathan, and Jason. “F”, “I”, “O”, “Q”, “U”, “X”, “Y”, and “Z” are all rare first initials for both sexes.
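The grouping step can be sketched in a few lines of Python. This is only a rough illustration, assuming the names have already been copied into a dictionary of name to percent-of-births; the function name `initial_shares` and the toy numbers are made up for the example and are not real SSA figures.

```python
from collections import defaultdict


def initial_shares(name_percents):
    """Sum per-name percentages by first initial, then rescale so the
    initials found in the list add up to 100%."""
    totals = defaultdict(float)
    for name, percent in name_percents.items():
        totals[name[0].upper()] += percent
    covered = sum(totals.values())  # share of births the list accounts for
    return {initial: 100.0 * total / covered
            for initial, total in sorted(totals.items())}


# Placeholder percentages for illustration only, NOT real SSA data.
toy = {"Ashley": 2.0, "Amanda": 1.0, "Jessica": 3.0, "Jennifer": 2.0}
shares = initial_shares(toy)
for initial, share in shares.items():
    print(f"{initial}: {share:.2f}%")
```

With a real list of 1000 names per sex, the same function would give one percentage per letter, ready to feed into a chart.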
I know humans have been scientifically shown to be bad at reading pie charts, but I think it actually works here:
An important note: the SSA website only gives the 1000 most popular names for each sex for each year. These 1000 are my only source for the statistics above, so the underlying percentages don’t really add up to 100%. In 1986, the first 1000 girls’ names added up to 80.19%, while the boys’ added up to 90.07%. This is interesting in itself: it suggests girls are more likely than boys to receive a less popular (more unusual?) name, one outside the top 1000. That happened for 19.81% of girls versus 9.93% of boys, so maybe twice as likely.
But it also means that I extrapolated the remaining parts of the population. So strictly speaking, all I can say is that “A” names made up at least 12.41% of all girls born that year and at least 7.61% of all boys, and so on with the other letters. I think it’s reasonable to assume that the remaining parts of the population roughly agree with the distribution found in the first 1000 names, but I can’t be sure.
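Here is the arithmetic behind those "at least" figures, as a small Python sketch. It assumes the headline percentages are the top-1000 shares rescaled to 100%, which matches the numbers quoted above; the helper name `lower_bound` is just something I made up for the example.

```python
def lower_bound(rescaled_percent, coverage_percent):
    """Turn a share of the top-1000 names back into a guaranteed
    minimum share of all births for that sex."""
    return rescaled_percent * coverage_percent / 100.0


girls_coverage = 80.19  # % of 1986 girls covered by the top 1000 names
boys_coverage = 90.07   # % of 1986 boys covered by the top 1000 names

girls_a = lower_bound(15.47, girls_coverage)  # "A" among girls
boys_j = lower_bound(19.92, boys_coverage)    # "J" among boys

# How much more often girls fall outside the top 1000 than boys do.
outside_ratio = (100 - girls_coverage) / (100 - boys_coverage)

print(f"Girls 'A': at least {girls_a:.2f}%")
print(f"Boys 'J': at least {boys_j:.2f}%")
print(f"Outside-top-1000 ratio: {outside_ratio:.2f}")
```

The first line reproduces the 12.41% figure for girls, and the ratio comes out just under 2, which is where "maybe twice as likely" comes from.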


