How an Electrical Engineer Solved Australia’s Most Famous Cold Case
Mon, 20 Mar 2023 15:00:02 +0000
Dead, and in a jacket and tie. That’s how he was on 1 December 1948, when two men found him slumped against a retaining wall on the beach at Somerton, a suburb of Adelaide, Australia.
The Somerton Man’s body was found on a beach in 1948. Nobody came forward to identify him. JAMES DURHAM
Police distributed a photograph, but no one came forward to claim the body. Eyewitnesses reported having seen the man, whom the newspapers dubbed the Somerton Man and who appeared to be in his early 40s, lying on the beach earlier, perhaps at one point moving his arm, and they had concluded that he was drunk. The place of death led the police to treat the case as a suicide, despite the apparent lack of a suicide note. The presence of blood in the stomach, a common consequence of poisoning, was noted at the autopsy. Several chemical assays failed to identify any poison; granted, the methods of the day were not up to the task.
There was speculation of foul play. Perhaps the man was a spy who had come in from the cold; 1948 was the year after the Cold War got its name. This line of thought was strengthened, a few months later, by codelike writings in a book that came to be associated with the case.
These speculations aside, the idea that a person could simply die in plain view and without friends or family was shocking. This was a man with an athletic build, wearing a nice suit, and showing no signs of having suffered violence. The problem nagged many people over the years, and eventually it took hold of me. In the late 2000s, I began working on the Somerton Man mystery, devoting perhaps 10 hours a week to the research over the course of about 15 years.
Throughout my career, I have always been interested in cracking mysteries. My students and I used computational linguistics to identify which of the three authors of The Federalist Papers—Alexander Hamilton, James Madison, and John Jay—was responsible for any given essay. We tried using the same method to confirm authorship of Biblical passages. More recently, we’ve been throwing some natural-language processing techniques into an effort to decode the Voynich Manuscript, an early 15th-century document written in an unknown language and an unknown script. These other projects yield to one or another key method of inquiry. The Somerton Man problem posed a broader challenge.
My one great advantage has been my access to students and to scientific instruments at the University of Adelaide, where I am a professor of electrical and electronic engineering. In 2009, I established a working group at the university’s Center for Biomedical Engineering.
One question surrounding the Somerton Man had already been solved by sleuths of a more literary bent. In 1949, a pathologist had found a bit of paper concealed in one of the dead man’s pockets, and on it were printed the words Tamám Shud, the Persian for “finished.” The phrase appears at the end of Edward FitzGerald’s translation of the Rubáiyát of Omar Khayyám, a poem that remains popular to this day.
The police asked the public for copies of the book from which the final page had been torn out. A man found such a book in his car, where it had apparently been thrown in through an open window. The book proved a match.
The back cover of the book also included scribbled letters, which were at first thought to constitute an encrypted message. But statistical tests carried out by my team showed that it was more likely a string of the initial letters of words. Through computational techniques, we eliminated all of the cryptographic codes known in the 1940s, leaving as a remaining possibility a one-time pad, in which each letter is based on a secret source text. We ransacked the poem itself and other texts, including the Bible and the Talmud, but we never identified a plausible source text. It could have been a pedestrian aide-mémoire—to list the names of horses in an upcoming race, for example. Moreover, our research indicates that it doesn’t have the structural sophistication of a code. The Persian phrase could have been the man’s farewell to the world: his suicide note.
Also scribbled on the back cover was a telephone number that led to one Jo Thomson, a woman who lived merely a five-minute walk from where the Somerton Man had been found. Interviewers then and decades later reported that she had seemed evasive; after her death, some of her relatives and friends said they speculated that she must have known the dead man. I discovered a possible clue: Thomson’s son was missing his lateral incisors, the two teeth that normally flank the central incisors. This condition, found in a very small percentage of the population, is often congenital; oddly, the Somerton Man had it, too. Were they related?
And yet the attempt to link Thomson to the body petered out. Early in the investigation, she told the police that she had given a copy of the Rubáiyát to a lieutenant in the Australian Army whom she had known during the war, and indeed, that man turned out to own a copy. But Thomson hadn’t seen him since 1945, he was very much alive, and the last page of his copy was still intact. A trail to nowhere, one of many that were to follow.
We engineers in the 21st century had several other items to examine. First was a plaster death mask that had been made six months after the man died, during which time the face had flattened. We tried several methods to reconstruct its original appearance: In 2013 we commissioned a picture by Greg O’Leary, a professional portrait artist. Then, in 2020, we approached Daniel Voshart, who designs graphics for Star Trek movies. He used a suite of professional AI tools to create a lifelike reconstruction of the Somerton Man. Later, we obtained another reconstruction by Michael Streed, a U.S. police sketch artist. We published these images, together with many isolated facts about the body, the teeth, and the clothing, in the hope of garnering insights from the public. No luck.
As the death mask had been molded directly off the Somerton Man’s head, neck, and upper body, some of the man’s hair was embedded in the plaster of Paris—a potential DNA gold mine. At the University of Adelaide, I had the assistance of a hair forensics expert, Janette Edson. In 2012, with the permission of the police, Janette used a magnifying glass to find where several hairs came together in a cluster. She was then able to pull out single strands without breaking them or damaging the plaster matrix. She thus secured the soft, spongy hair roots as well as several lengths of hair shaft. The received wisdom of forensic science at the time held that the hair shaft would be useless for DNA analysis without the hair root.
Janette performed our first DNA analysis in 2015 and, from the hair root, was able to place the sample within a maternal genetic lineage, or haplotype, known as “H,” which is widely spread around Europe. (Such maternally inherited DNA comes not from the nucleus of a cell but from the mitochondria.) The test therefore told us little we hadn’t already known. The concentration of DNA was far too low for the technology of the time to piece together the sequencing we needed.
Fortunately, sequencing tools continued to improve. In 2018, Guanchen Li and Jeremy Austin, also at the University of Adelaide, obtained the entire mitochondrial genome from hair-root material and narrowed down the maternal haplotype to H4a1a1a.
However, to identify Somerton Man using DNA databases, we needed to go to autosomal DNA, the kind that is inherited from both parents. There are more than 20 such databases, 23andMe and Ancestry being the largest. These databases require sequences of 500,000 to 2,000,000 single nucleotide polymorphisms, or SNPs (pronounced “snips”). The concentration levels of autosomes in the human cell tend to be much lower than those of the mitochondria, and so Li and Austin were able to obtain only 50,000 SNPs, of which 16,000 were usable. This was a breakthrough, but it still wasn’t good enough to run a search against a database.
In 2022, at the suggestion of Colleen Fitzpatrick, a former NASA employee who had trained as a nuclear physicist but then became a forensic genetics expert, I sent a hair sample to Astrea Forensics, a DNA lab in the United States. This was our best hair-root sample, one that I had nervously guarded for 10 years. The result from Astrea came back—and it was a big flop.
Seemingly out of options, we tried a desperate move. We asked Astrea to analyze a 5-centimeter-long shaft of hair that had no root at all. Bang! The company retrieved 2 million SNPs. The identity of the Somerton Man was now within our reach.
So why did the rootless shaft work in our case?
The DNA analysis that police use for standard crime-solving relies on only 20 to 25 short tandem repeats (STRs) of DNA. That’s fine for police, who mostly do one-to-one matches to determine whether the DNA recovered at a crime scene matches a suspect’s DNA.
But finding distant cousins of the Somerton Man on genealogical databases constitutes a one-to-many search, and for that you typically need around 500,000 markers. For these genealogical searches, SNPs are used because they contain information on ethnicity and ancestry generally. Note that a SNP is a change at a single base and can be read from a DNA fragment of around 50 to 150 base pairs, whereas typical STRs tend to be longer, spanning 80 to 450 base pairs. The hair shaft contains DNA that is mostly fragmented, so it’s of little use when you’re seeking longer STR segments but it’s a great source of SNPs. So this is why crime forensics traditionally focused on the root and ignored the shaft, although this practice is now changing very slowly.
Another reason the shaft was such a trove of DNA is that keratin, its principal component, is a very tough protein, and it had protected the DNA fragments lodged within it. The 74-year-old soft spongy hair root, on the other hand, had not protected the DNA to the same extent. We set a world record for obtaining a human identification, using forensic genealogy, from the oldest piece of hair shaft. Several police departments in the United States now use hair shafts to retrieve DNA, as I am sure many will start to do in other countries, following our example.
Libraries of SNPs can be used to untangle the branching lines of descent in a family tree. We uploaded our 2 million SNPs to GEDmatch Pro, an online genealogical database located in Lake Worth, Fla. (and recently acquired by Qiagen, a biotech company based in the Netherlands). The closest match was a rather distant relative based in Victoria, Australia. Together with Colleen Fitzpatrick, I built out a family tree containing more than 4,000 people. On that tree we found a Charles Webb, son of a baker, born in 1905 in Melbourne, with no date of death recorded.
Charles never had children of his own, but he had five siblings, and I was able to locate some of their living descendants. Their DNA was a dead match. I also found a descendant of one of his maternal aunts, who agreed to undergo a test. When a positive result came through on 22 July 2022, we had all the evidence we needed. This was our champagne moment.
In late 2021, police in South Australia ordered an exhumation of the Somerton Man’s body for a thorough analysis of its DNA. At the time we prepared this article, they had not yet confirmed our result, but they did announce that they were “cautiously optimistic” about it.
All at once, we were able to fill in a lot of blank spaces. Webb was born on 16 November 1905, in Footscray, a suburb of Melbourne, and educated at a technical college, now Swinburne University of Technology. He later worked as an electrical technician at a factory that made electric hand drills. Our DNA tests confirmed he was not related to Thomson’s son, despite the coincidence of their missing lateral incisors.
We discovered that Webb had married a woman named Dorothy Robinson in 1941 and had separated from her in 1947. She filed for divorce on grounds of desertion, and the divorce lawyers visited his former place of work, confirming that he had quit around 1947 or 1948. But they could not determine what happened to him after that. The divorce finally came through in 1952; in those days, divorces in Australia were granted only five years after separation.
At the time of Webb’s death his family had become quite fragmented. His parents were dead, a brother and a nephew had died in the war, and his eldest brother was ill. One of his sisters died in 1955 and left him money in her will, mistakenly thinking he was still alive and living in another state. The lawyers administering the will were unable to locate Charles.
We got more than DNA from the hair: We also vaporized a strand of hair by scanning a laser along its length, a technique known as laser ablation. By performing mass spectrometry on the vapor, we were able to track Webb’s varying exposure to lead. A month before Webb’s death, his lead level was high, perhaps because he had been working with the metal, maybe soldering with it. Over the next month’s worth of hair growth, the lead concentration declined; it reached its lowest level at his death. This might be a sign that he had moved.
With a trove of photographs from family albums and other sources, we were able to compare the face of the young Webb with the artists’ reconstructions we had commissioned in 2013 and 2021 and the AI reconstruction we had commissioned in 2020. Interestingly, the AI reconstruction had best captured his likeness.
A group photograph, taken in 1921, of the Swinburne College football team, included a young Webb. Clues found in newspapers show that he continued to participate in various sports, which would explain the athletic condition of his body.
What’s interesting about solving such a case is how it relies on concepts that may seem counterintuitive to forensic biologists but are quite straightforward to an electronics engineer. For example, when dealing with a standard crime scene that uses only two dozen STR markers, one observes very strict protocols to ensure the integrity of the full set of STRs. When dealing with a case with 2 million SNPs, by contrast, things are more relaxed. Many of the old-school STR protocols don’t apply when you have access to a lot of information. Many SNPs can drop out, some can even be “noise,” the signal may not be clean—and yet you can still crack the case!
Engineers understand this concept well. It’s what we call graceful degradation—when, say, a few flipped bits on a digital video signal are hardly noticed. The same is true for a large SNP file.
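To make graceful degradation concrete, here is a minimal sketch in Python. It is a toy and not the method used in the case: real genealogical matching looks at shared DNA segments, whereas this example only counts how often two profiles agree. The profile size, the dropout and error rates, and the names `concordance` and `degrade` are all assumptions chosen for illustration.

```python
import random

def concordance(a, b):
    """Fraction of sites with the same genotype, counting only sites called in both."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)

def degrade(profile, dropout, error, rng):
    """Return a copy in which some sites are missing (None) and some are miscalled."""
    out = []
    for g in profile:
        if rng.random() < dropout:
            out.append(None)  # site dropped out entirely
        elif rng.random() < error:
            # site miscalled as one of the two other genotypes ("noise")
            out.append(rng.choice([x for x in (0, 1, 2) if x != g]))
        else:
            out.append(g)
    return out

rng = random.Random(0)  # fixed seed so the run is repeatable
n_sites = 10_000        # assumed toy size, far below 2 million

# Genotypes coded as 0, 1, or 2 copies of an allele (illustrative, uniform draw).
truth = [rng.choice((0, 1, 2)) for _ in range(n_sites)]
stranger = [rng.choice((0, 1, 2)) for _ in range(n_sites)]

# A badly damaged read of the true profile: 40% dropout, 5% miscalls.
damaged = degrade(truth, dropout=0.40, error=0.05, rng=rng)

called = sum(g is not None for g in damaged)
same_score = concordance(damaged, truth)
stranger_score = concordance(damaged, stranger)

print(f"sites called: {called} of {n_sites}")
print(f"agreement with the true profile: {same_score:.3f}")
print(f"agreement with an unrelated profile: {stranger_score:.3f}")
```

Even with a large share of sites missing and some miscalled, the damaged profile still agrees with its source far more often than with a random one. With only two dozen markers, the same damage could erase the distinction.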
And so, when Astrea retrieved the 2 million SNPs, the company didn’t rely on the traditional framework for DNA-sequencing reads. It used a completely different mathematical framework, called imputation. The concept of imputation is not yet fully appreciated by forensics experts who have a biological background. However, for an electronics engineer, the concept is similar to error correction: We infer and “impute” bits of information that have dropped out of a received digital signal. Such an approach is not possible with a few STRs, but when handling over a million SNPs, it’s a different ball game.
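The error-correction analogy can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not Astrea's method: production imputation software relies on statistical models and very large reference panels, while this toy picks the single reference sequence that best agrees with the observed sites and copies its values into the gaps. The function name `impute`, the three-row panel, and the observed values are all made up for illustration.

```python
def impute(observed, panel):
    """Fill missing sites (None) from the reference sequence that best fits the observed ones."""
    def mismatches(ref):
        # Count disagreements only at sites that were actually observed.
        return sum(o != r for o, r in zip(observed, ref) if o is not None)

    best = min(panel, key=mismatches)
    return [r if o is None else o for o, r in zip(observed, best)]

# Hypothetical reference panel: each row is one known sequence of 0/1 alleles.
panel = [
    [0, 1, 1, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
]

# Hypothetical degraded sample: three of eight sites dropped out.
observed = [0, None, 1, None, 1, 0, None, 1]

filled = impute(observed, panel)
print(filled)
```

The more sites are observed, the more reliably the best-fitting reference can be chosen, which is why the approach suits a profile with over a million SNPs and not one with a handful of STRs.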
Much of the work on identifying Charles Webb from his genealogy had to be done manually because there are simply no automated tools for the task. As an electronics engineer, I now see possible ways to make tools that would speed up the process. One such tool my team has been working on, together with Colleen Fitzpatrick, is software that can input an entire family tree and represent all of the birth locations as colored dots on Google Earth. This helps to visualize geolocation when dealing with a large and complex family.
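As a rough idea of what such a tool might involve, the sketch below writes birth locations to KML, the file format Google Earth opens, using only the Python standard library. It is not the team's software. The record layout, the `branch` field, the color scheme, and the second person are hypothetical, and the coordinates are approximate values for Footscray and Adelaide.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

# KML colors are written as aabbggrr (alpha, blue, green, red) in hex.
BRANCH_COLORS = {
    "paternal": "ff0000ff",  # opaque red
    "maternal": "ffff0000",  # opaque blue
}

def tag(name):
    """Qualify a tag name with the KML namespace."""
    return f"{{{KML_NS}}}{name}"

def tree_to_kml(people):
    """Build a KML document with one colored placemark per birth location."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(tag("kml"))
    doc = ET.SubElement(kml, tag("Document"))
    for person in people:
        placemark = ET.SubElement(doc, tag("Placemark"))
        ET.SubElement(placemark, tag("name")).text = person["name"]
        style = ET.SubElement(placemark, tag("Style"))
        icon_style = ET.SubElement(style, tag("IconStyle"))
        ET.SubElement(icon_style, tag("color")).text = BRANCH_COLORS[person["branch"]]
        point = ET.SubElement(placemark, tag("Point"))
        # KML expects longitude first, then latitude, then altitude.
        ET.SubElement(point, tag("coordinates")).text = (
            f"{person['lon']},{person['lat']},0"
        )
    return ET.tostring(kml, encoding="unicode")

# Illustrative records only; the second entry is a placeholder, not a real relative.
people = [
    {"name": "Charles Webb (b. 1905, Footscray)", "branch": "paternal",
     "lat": -37.80, "lon": 144.90},
    {"name": "Hypothetical relative (Adelaide)", "branch": "maternal",
     "lat": -34.93, "lon": 138.60},
]

kml_text = tree_to_kml(people)
print(kml_text)
```

Saving `kml_text` to a file with a `.kml` extension and opening it in Google Earth should show one dot per person, colored by branch of the family.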
The Somerton Man case still has its mysteries. We cannot yet determine where Webb lived in his final weeks or what he was doing. Although the literary clue he left in his pocket was probably an elliptical suicide note, we cannot confirm the exact cause of death. There is still room for research; there is much we do not know.
This article appears in the April 2023 print issue as “Finding Somerton Man.”
Not all technological innovation deserves to be called progress. That’s because some advances, despite their conveniences, may not do as much societal advancing, on balance, as advertised. One researcher who stands opposite technology’s cheerleaders is MIT economist Daron Acemoglu. (The “c” in his surname is pronounced like a soft “g.”) IEEE Spectrum spoke with Acemoglu, whose fields of research include labor economics, political economy, and development economics, about his recent work and his take on whether technologies such as artificial intelligence will have a positive or negative net effect on human society.
IEEE Spectrum: In your November 2022 working paper “Automation and the Workforce,” you and your coauthors say that the record is, at best, mixed when AI encounters the job force. What explains the discrepancy between the greater demand for skilled labor and their staffing levels?
Acemoglu: Firms often lay off less-skilled workers and try to increase the employment of skilled workers.
In theory, high demand and tight supply are supposed to result in higher prices—in this case, higher salary offers. It stands to reason that, based on this long-accepted principle, firms would think ‘More money, less problems.’
Acemoglu: You may be right to an extent, but... when firms are complaining about skill shortages, a part of it is I think they’re complaining about the general lack of skills among the applicants that they see.
In your 2021 paper “Harms of AI,” you argue if AI remains unregulated, it’s going to cause substantial harm. Could you provide some examples?
Acemoglu: Well, let me give you two examples from ChatGPT, which is all the rage nowadays. ChatGPT could be used for many different things. But the current trajectory of the large language model, epitomized by ChatGPT, is very much focused on the broad automation agenda. ChatGPT tries to impress the users… What it’s trying to do is be as good as humans in a variety of tasks: answering questions, being conversational, writing sonnets, and writing essays. In fact, in a few things, it can be better than humans because writing coherent text is a challenging task and predictive tools of what word should come next, on the basis of the corpus of a lot of data from the Internet, do that fairly well.
The path that GPT-3 [the large language model that spawned ChatGPT] is going down is emphasizing automation. And there are already other areas where automation has had a deleterious effect: job losses, inequality, and so forth. If you think about it you will see, or you could argue anyway, that the same architecture could have been used for very different things. Generative AI could be used, not for replacing humans, but to be helpful for humans. If you want to write an article for IEEE Spectrum, you could either go and have ChatGPT write that article for you, or you could use it to curate a reading list for you that might capture things you didn’t know yourself that are relevant to the topic. The question would then be how reliable the different articles on that reading list are. Still, in that capacity, generative AI would be a human-complementary tool rather than a human-replacement tool. But that’s not the trajectory it’s going in right now.
Let me give you another example more relevant to the political discourse. Again, the ChatGPT architecture is based on just taking information from the Internet that it can get for free. And then, having a centralized structure operated by OpenAI, it has a conundrum: If you just take the Internet and use your generative AI tools to form sentences, you could very likely end up with hate speech including racial epithets and misogyny, because the Internet is filled with that. So, how does ChatGPT deal with that? Well, a bunch of engineers sat down and they developed another set of tools, mostly based on reinforcement learning, that allow them to say, “These words are not going to be spoken.” That’s the conundrum of the centralized model. Either it’s going to spew hateful stuff or somebody has to decide what’s sufficiently hateful. But that is not going to be conducive to any type of trust in political discourse, because it could turn out that three or four engineers, essentially a group of white coats, get to decide what people can hear on social and political issues. I believe those tools could be used in a more decentralized way, rather than under the auspices of centralized big companies such as Microsoft, Google, Amazon, and Facebook.
Instead of continuing to move fast and break things, innovators should take a more deliberate stance, you say. Are there some definite no-nos that should guide the next steps toward intelligent machines?
Acemoglu: Yes. And again, let me give you an illustration using ChatGPT. They wanted to beat Google [to market, understanding that] some of the technologies were originally developed by Google. And so, they went ahead and released it. It’s now being used by tens of millions of people, but we have no idea what the broader implications of large language models will be if they are used this way, or how they’ll impact journalism, middle school English classes, or what political implications they will have. Google is not my favorite company, but in this instance, I think Google would be much more cautious. They were actually holding back their large language model. But OpenAI, taking a page from Facebook’s ‘move fast and break things’ code book, just dumped it all out. Is that a good thing? I don’t know. OpenAI has become a multi-billion-dollar company as a result. It was always a part of Microsoft in reality, but now it’s been integrated into Microsoft Bing, while Google lost something like 100 billion dollars in value. So, you see the high-stakes, cutthroat environment we are in and the incentives that that creates. I don’t think we can trust companies to act responsibly here without regulation.
Tech companies have asserted that automation will put humans in a supervisory role instead of just killing all jobs. The robots are on the floor, and the humans are in a back room overseeing the machines’ activities. But who’s to say the back room is not across an ocean instead of on the other side of a wall—a separation that would further enable employers to slash labor costs by offshoring jobs?
Acemoglu: That’s right. I agree with all those statements. I would say, in fact, that’s the usual excuse of some companies engaged in rapid algorithmic automation. It’s a common refrain. But you’re not going to create 100 million jobs of people supervising, providing data, and training to algorithms. The point of providing data and training is that the algorithm can now do the tasks that humans used to do. That’s very different from what I’m calling human complementarity, where the algorithm becomes a tool for humans.
According to “The Harms of AI,” executives trained to hack away at labor costs have used tech to help, for instance, skirt labor laws that benefit workers. Say, scheduling hourly workers’ shifts so that hardly any ever reach the weekly threshold of hours that would make them eligible for employer-sponsored health insurance coverage and/or overtime pay.
Acemoglu: Yes, I agree with that statement too. Even more important examples would be using AI for monitoring workers, and for real-time scheduling which might take the form of zero-hour contracts. In other words, I employ you, but I do not commit to providing you any work. You’re my employee. I have the right to call you. And when I call you, you’re expected to show up. So, say I’m Starbucks. I’ll call and say ‘Willie, come in at 8am.’ But I don’t have to call you, and if I don’t do it for a week, you don’t make any money that week.
Will the simultaneous spread of AI and the technologies that enable the surveillance state bring about a total absence of privacy and anonymity, as was depicted in the sci-fi film Minority Report?
Acemoglu: Well, I think it has already happened. In China, that’s exactly the situation urban dwellers find themselves in. And in the United States, it’s actually private companies. Google has much more information about you and can constantly monitor you unless you turn off various settings in your phone. It’s also constantly using the data you leave on the Internet, on other apps, or when you use Gmail. So, there is a complete loss of privacy and anonymity. Some people say ‘Oh, that’s not that bad. Those are companies. That’s not the same as the Chinese government.’ But I think it raises a lot of issues that they are using data for individualized, targeted ads. It’s also problematic that they’re selling your data to third parties.
In four years, when my children will be about to graduate from college, how will AI have changed their career options?
Acemoglu: That goes right back to the earlier discussion with ChatGPT. Programs like GPT-3 and GPT-4 may scuttle a lot of careers but without creating huge productivity improvements on their current path. On the other hand, as I mentioned, there are alternative paths that would actually be much better. AI advances are not preordained. It’s not like we know exactly what’s going to happen in the next four years, but it’s about trajectory. The current trajectory is one based on automation. And if that continues, lots of careers will be closed to your children. But if the trajectory goes in a different direction, and becomes human complementary, who knows? Perhaps they may have some very meaningful new occupations open to them.