Statistical Problems and Solutions in Onomastic Research

Exemplified by a Comparison of Given Name Distributions in Germany Throughout the 20th Century


Available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1722527


Abstract: The German Socio-Economic Panel Study (SOEP) offers the rare opportunity to examine patterns of given names in a representative sample of more than 50,000 people born since 1900. This article develops an exemplary picture of typical frequency distributions of given names and their development over time. First, we discuss the advantages and limitations of various databases that have been widely used to study the distribution of given names. Second, we address the problem that name distributions are typically characterized by a "Large Number of Rare Events" (LNRE) zone, focusing on the difficulties this creates for comparing name distributions. Third, we apply measures of the concentration of distributions borrowed from other lines of research (economics and computational linguistics). Finally, we stress the problem of assessing the statistical significance of differences between name distributions that are based on samples.

Keywords: Given names, large number of rare events (LNRE), concentration of distributions, SOEP

JEL Classification: C49, C83, Y8

Suggested Citation: Huschka, Denis and Wagner, Gert G., Statistical Problems and Solutions in Onomastic Research - Exemplified by a Comparison of Given Name Distributions in Germany Throughout the 20th Century (November 1, 2010). SOEPpaper No. 332, Available at SSRN: https://ssrn.com/abstract=1722527 or http://dx.doi.org/10.2139/ssrn.1722527
