Robocall prevention using automated call content analysis

Researchers at North Carolina State University have published a paper on a new robocall identification tool they have developed to automate call content analysis. Let’s have a look.

SnorCall, a new robocall analysis tool

The authors of this paper are Bradley Reaves, Sathvik Prasad, Trevor Dunlap, and Alexander Ross.

The researchers developed a system they call “SnorCall,” which they use to transcribe and analyze recorded robocalls. One of its components is Snorkel, an open-source Python library for programmatically building labeled training datasets, used here for natural language processing. That is where the name SnorCall comes from.

The SnorCall system performs five processes:

  1. Audio clustering — deduplicate recorded robocall audio messages
  2. Speech-to-text transcription — get the recorded message into text that can be analyzed
  3. Language filter — skip messages that were not in English
  4. Text analysis — identify keywords, call to action, callback number
  5. Labeling — categorize robocalls.
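To make the flow above more concrete, here is a minimal sketch in plain Python. It is not the authors' code and does not use the Snorkel API. It starts from text, so speech-to-text is skipped, and exact-text matching stands in for audio clustering. The keyword rules, the label names, and the small English word list are all hypothetical. The one idea it borrows is that several small labeling functions each vote or abstain, and the votes are combined into one label.

```python
import re
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion

# Hypothetical keyword rules, invented for this sketch.
def lf_social_security(text):
    return "ssa_scam" if "social security" in text else ABSTAIN

def lf_warranty(text):
    return "auto_warranty" if "warranty" in text else ABSTAIN

def lf_vehicle(text):
    return "auto_warranty" if re.search(r"\b(vehicle|car)\b", text) else ABSTAIN

LABELING_FUNCTIONS = [lf_social_security, lf_warranty, lf_vehicle]

# Tiny word list used as a crude English check (an assumption, not a real detector).
COMMON_ENGLISH = {"the", "your", "you", "is", "to", "this", "a", "and", "we", "has"}

def normalize(text):
    """Lowercase and drop punctuation so near-identical transcripts compare equal."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))

def deduplicate(transcripts):
    """Stand-in for audio clustering: keep the first transcript per normalized text."""
    seen = {}
    for transcript in transcripts:
        seen.setdefault(normalize(transcript), transcript)
    return list(seen.values())

def looks_english(text, threshold=0.2):
    """Keep a transcript if enough of its words are common English words."""
    words = normalize(text).split()
    if not words:
        return False
    hits = sum(word in COMMON_ENGLISH for word in words)
    return hits / len(words) >= threshold

def label(text):
    """Combine labeling-function votes by simple majority."""
    clean = normalize(text)
    votes = []
    for lf in LABELING_FUNCTIONS:
        vote = lf(clean)
        if vote is not ABSTAIN:
            votes.append(vote)
    if not votes:
        return "unlabeled"
    return Counter(votes).most_common(1)[0][0]

def analyze(transcripts):
    """Deduplicate, filter by language, then label each remaining transcript."""
    unique = deduplicate(transcripts)
    english = [t for t in unique if looks_english(t)]
    return [(t, label(t)) for t in english]

# Made-up transcripts for illustration only.
SAMPLE = [
    "This is a call about your vehicle warranty.",
    "this is a call about your vehicle warranty",
    "Su número de seguro social ha sido suspendido.",
    "Your social security number has been suspended.",
]
RESULT = analyze(SAMPLE)
```

With this sample, the second transcript is dropped as a duplicate and the Spanish one is filtered out, leaving two labeled messages. A real system would replace each stand-in with a proper component.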

Robocall study

The researchers used their SnorCall system to analyze the robocalls they captured. Here are some statistics for their study:

  • Their honeypot had 5,949 telephone numbers
  • The study ran from January 1, 2020, to November 30, 2021
  • The honeypot received 1,355,672 calls
  • 371,045 calls (27.37%) contained sufficient audio for further analysis
  • 26,791 calls were transcribed into text.
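The figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this article. The derived ratios (calls per honeypot number, calls per day, and the share of audio-bearing calls that were transcribed) are our own calculations and are not reported in this article.

```python
from datetime import date

# Figures quoted in the article.
HONEYPOT_NUMBERS = 5_949
TOTAL_CALLS = 1_355_672
CALLS_WITH_AUDIO = 371_045
CALLS_TRANSCRIBED = 26_791
STUDY_START = date(2020, 1, 1)
STUDY_END = date(2021, 11, 30)

def pct(part, whole):
    """Percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

# Count both the first and the last day of the study.
study_days = (STUDY_END - STUDY_START).days + 1            # 700

audio_share = pct(CALLS_WITH_AUDIO, TOTAL_CALLS)            # 27.37, as stated above
transcribed_share = pct(CALLS_TRANSCRIBED, CALLS_WITH_AUDIO)  # 7.22

# Derived averages (our own, not from the study).
calls_per_number = round(TOTAL_CALLS / HONEYPOT_NUMBERS, 1)
calls_per_day = round(TOTAL_CALLS / study_days)

print(f"{study_days} days, {audio_share}% with audio, "
      f"{transcribed_share}% of those transcribed")
print(f"{calls_per_number} calls per number, about {calls_per_day} calls per day")
```

The 27.37% figure checks out. The sharp drop from calls with audio to transcribed calls is consistent with deduplication happening before transcription, as the process list suggests, although this article does not state that explicitly.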


One of the interesting findings is that many robocalls included a callback number. The researchers speculate that, because subscribers aren’t answering calls from unknown callers, robocallers are putting a callback number in their recorded messages. Here are the callback number statistics:

  • Callback numbers were used in 45% of robocalls.
  • These numbers were the only method of interaction in 17% of robocalls.
  • Only 4.23% of callback numbers matched caller ID.
  • 27.32% of callback numbers were used among different robocall campaigns.
  • Callback numbers had a median lifespan of just eight days.
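As an illustration of how statistics like these could be derived from transcripts, here is a minimal Python sketch. It is not the researchers' method. The regular expression, the record fields (`day`, `caller_id`, `transcript`, `campaign`), and the sample calls are all assumptions made for this example, and the study's exact definitions (for instance, how lifespan is counted) may differ.

```python
import re
from datetime import date
from statistics import median

# Assumed pattern: 10-digit North American numbers written as digits,
# with an optional leading 1 and common separators.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]*)?\(?(\d{3})\)?[\s.-]*(\d{3})[\s.-]*(\d{4})(?!\d)"
)

def extract_callback_numbers(transcript):
    """Return every phone number found in a transcript as a 10-digit string."""
    return ["".join(m.groups()) for m in PHONE_RE.finditer(transcript)]

def normalize_caller_id(caller_id):
    """Keep the last 10 digits so '+1...' and bare numbers compare equal."""
    return re.sub(r"\D", "", caller_id)[-10:]

def callback_stats(calls):
    """Summarize callback-number usage across a list of call records."""
    calls_with_callback = 0
    seen_days = {}   # number -> days on which it was heard
    campaigns = {}   # number -> campaigns that used it
    matched = set()  # numbers that equaled the caller ID at least once
    for call in calls:
        numbers = extract_callback_numbers(call["transcript"])
        if numbers:
            calls_with_callback += 1
        for number in numbers:
            seen_days.setdefault(number, []).append(call["day"])
            campaigns.setdefault(number, set()).add(call["campaign"])
            if number == normalize_caller_id(call["caller_id"]):
                matched.add(number)
    if not seen_days:
        return None
    # Lifespan here counts both the first and last day a number was heard.
    lifespans = [(max(d) - min(d)).days + 1 for d in seen_days.values()]
    total = len(seen_days)
    shared = sum(len(c) > 1 for c in campaigns.values())
    return {
        "calls_with_callback": calls_with_callback / len(calls),
        "numbers_matching_caller_id": len(matched) / total,
        "numbers_shared_across_campaigns": shared / total,
        "median_lifespan_days": median(lifespans),
    }

# Made-up records using fictional 555-01xx numbers.
SAMPLE_CALLS = [
    {"day": date(2021, 3, 1), "caller_id": "+19195550142", "campaign": "warranty",
     "transcript": "Press 1 now or call us back at 1-800-555-0100."},
    {"day": date(2021, 3, 8), "caller_id": "+12025550111", "campaign": "loans",
     "transcript": "Call 800 555 0100 about your student loan."},
    {"day": date(2021, 3, 2), "caller_id": "+19195550142", "campaign": "warranty",
     "transcript": "Please call (919) 555-0142 today."},
    {"day": date(2021, 3, 3), "caller_id": "+13125550199", "campaign": "warranty",
     "transcript": "Press 1 to speak to an agent."},
]
STATS = callback_stats(SAMPLE_CALLS)
```

Real speech-to-text output often spells numbers out as words or splits digits apart, so a production extractor would need to handle far more variation than this pattern does.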

Robocallers often spoof their calling number, but a callback number has to be real, because a returned call must actually reach the robocaller. Callback numbers can therefore be used to identify the carrier that created the account and provided the telephone number to the robocaller.


The researchers suggested the following uses for automated call content analysis in robocall identification and prevention:

  • Analyzes thousands of robocalls automatically.
  • Enables frequent targets of robocall campaigns to warn their customers.
  • Enables regulators and law enforcement agencies to proactively uncover malicious robocall campaigns and prioritize their response.
  • Enables carriers to monitor malicious robocalls targeting their subscribers and to engage the upstream providers responsible for the traffic.
  • Enables investigators to identify the originators of unlawful robocalls through callback number extraction.

TransNexus solutions

TransNexus is a leader in developing innovative software to manage and protect telecommunications networks. The company has over 20 years’ experience in providing telecom software solutions including toll fraud prevention, robocall mitigation and prevention, TDoS prevention, analytics, routing, billing support, STIR/SHAKEN and SHAKEN certificate services.

Contact us today to learn more.
