"The Frankenstein dataset, born from undocumented modifications to a rigorous, peer-reviewed dataset, highlights a crisis in scientific integrity. Its story is not just a cautionary tale about bad data but a case study in how the scientific process can fail when institutional accountability is lacking. Let’s unpack how this dataset came to be, why it matters, and what it reveals about the state of climate science.
The Origins of the Original Dataset
The original dataset, as developed through decades of research by Pielke and his colleagues, aimed to normalize hurricane damages by adjusting for inflation, population growth, and other economic factors. This normalization process allowed researchers to compare historical hurricane damages on an apples-to-apples basis, isolating trends in economic loss from changes in wealth or development.
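In pseudocode terms, the approach multiplies a reported base-year loss by a chain of adjustment factors. The sketch below is a simplified illustration of that idea, not Pielke et al.'s exact procedure; the function and parameter names are hypothetical:

```python
def normalize_damage(base_loss, inflation_adj, wealth_adj, population_adj):
    """Scale a historical storm loss to a common benchmark year.

    Illustrative sketch only: the actual normalization (Pielke et al. 2008;
    Weinkle et al. 2018) derives these factors from specific economic and
    county-population series. All names here are hypothetical.
    """
    return base_loss * inflation_adj * wealth_adj * population_adj

# Example: a $100M loss in 1950, scaled by made-up adjustment factors.
print(normalize_damage(100e6, inflation_adj=9.1, wealth_adj=3.2,
                       population_adj=4.5))  # ~$13.1B
```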
The dataset, thoroughly documented in studies such as Weinkle et al. (2018) and Pielke et al. (2008), served as a reliable tool for understanding hurricane impacts. It was grounded in NOAA’s “best track” data for U.S. landfalling hurricanes and followed a consistent methodology throughout.
However, the story took a dark turn when this dataset fell into the hands of ICAT, an insurance company.
How the Frankenstein Dataset Was Born
After Pielke et al. (2008) was published, Pielke’s team partnered with ICAT to create the ICAT Damage Estimator, an online tool designed to visualize hurricane damages using the peer-reviewed dataset. Initially, the collaboration worked as intended: the tool increased access to high-quality research for industry stakeholders.
But in 2010, ICAT was acquired by another company, and Pielke ceased his involvement. Over the following years, ICAT employees, who lacked expertise in disaster normalization, made undocumented changes to the dataset. These alterations included replacing post-1980 entries with data from NOAA’s Billion-Dollar Disasters (BDD) database, which utilized a completely different methodology.
Key Modifications
Substitution with NOAA’s BDD Data: ICAT replaced post-1980 entries with BDD data, which included inland flooding damages (from the National Flood Insurance Program, or NFIP) and broader economic impacts like commodity losses and disaster relief payouts. These additional factors inflated damage estimates post-1980, creating an artificial upward trend (a synthetic sketch after this list illustrates the effect).
Additional Events: ICAT17, the modified dataset, introduced 61 additional storm damage events, none of which were sourced or documented. Most of these undocumented events occurred after 1980, further skewing the dataset.
Methodological Discontinuity: NOAA’s BDD methodology, adopted in 2016, was incompatible with the original dataset. For example, NFIP payouts didn’t exist before 1968, making comparisons between pre- and post-1968 damages inherently flawed.
Unsupervised Alterations: Beyond substituting BDD data, ICAT17 contained additional undocumented changes to the original dataset. These changes introduced upward biases even before normalization adjustments were applied."
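To see why mixing accounting scopes matters, consider a minimal, entirely synthetic sketch (all numbers are invented, no real loss data): a loss series with no underlying trend, where post-1980 entries are swapped for a broader-scope series that books roughly 40% more damage per event. A naive fit then reports an upward trend that is purely an artifact of the splice:

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1900, 2024)

# Trendless "true" normalized losses in arbitrary units (synthetic).
true_losses = rng.lognormal(mean=0.0, sigma=1.0, size=years.size)

# The splice: post-1980 entries come from a broader accounting scope
# (e.g., adding inland flood payouts), booking ~40% more per event.
spliced = np.where(years >= 1980, true_losses * 1.4, true_losses)

# A naive linear fit "finds" a slope in the spliced series that is an
# artifact of the methodological discontinuity, not of the climate.
print("consistent series slope:", np.polyfit(years, true_losses, 1)[0])
print("spliced series slope:   ", np.polyfit(years, spliced, 1)[0])
```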
"How the Frankenstein Dataset Was Misused
The ICAT17 dataset, later extended and rebranded as “XCAT/ICAT23” in Willoughby et al. (2024), was adopted by researchers who assumed it was a professionally maintained and credible resource. Notably:
Grinsted et al. (2019) and Willoughby et al. (2024) used XCAT to claim an upward trend in normalized U.S. hurricane damages, attributing this trend to climate change.
These studies were published in prominent journals like PNAS and JAMC and subsequently cited in influential reports, including the IPCC’s AR6.
However, Pielke’s analysis reveals that these trends vanish when the original dataset (Weinkle et al. 2018) is used instead of XCAT/ICAT23. In other words, the upward trends claimed in these studies are entirely a product of flawed data practices."
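The replication logic is straightforward: run the same trend test on both loss series and compare. Here is a hedged sketch of such a check (the cited papers’ actual statistical choices, such as log-transforming losses, may differ; the function name is hypothetical):

```python
from scipy.stats import linregress

def damage_trend(years, normalized_losses):
    """OLS slope of normalized losses, rescaled to change per decade,
    plus the fit's p-value. A minimal check only; the published studies'
    statistical treatments may differ.
    """
    fit = linregress(years, normalized_losses)
    return fit.slope * 10, fit.pvalue

# Run identically on both series: per Pielke's comparison, XCAT/ICAT23
# yields an upward slope while the Weinkle et al. (2018) series does not.
```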
Taken from -
https://wattsupwiththat.com/2024/12/23/frankenstein-datasets-and-the-crisis-in-c...