
Finding 300+ Plagiarists in 26 Hours With Dead Clicks

August 19, 2024 // Tyler Einberger // Read in 3 min

I tend to find humor in annoying things beyond my control, which is why I needed to write this follow-up post.

In my last post, I discussed how dead clicks in Microsoft Clarity provide valuable insights into user behavior and UX. Here’s a twist: Dead click data has also revealed another use case—catching people copying your content to use in generative AI tools like ChatGPT. Is this provable? Not really. But sometimes, you’ve just got to trust your gut—and mine says it’s happening.

What Are Dead Clicks?

Dead clicks occur when users interact with non-clickable elements like text or static images. These usually indicate a UX issue, but sometimes, they reveal something more curious—like people copying your content, possibly feeding it into AI. Read more about what dead clicks are here.

Dead Click Data & Plagiarism

Patterns in dead clicks can suggest content theft. If users consistently highlight and click on large sections of your text, followed by strange inactivity, it’s suspicious. They aren’t citing your work or bookmarking it for later, and no new backlinks appear. My intuition tells me they’re copying the content to paste into ChatGPT or another AI.

Using Dead Click Data to Spot Content Thieves

You can’t confirm plagiarism with absolute certainty, but certain behaviors give it away:

  • Repeatedly highlighting large sections of text.
  • Prolonged inactivity after clicking.
  • Long sessions where users just seem to linger.

These signs suggest they copy your content and run it through AI tools like ChatGPT.
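As a rough illustration, the three signals above could be combined into a simple session filter. This is a minimal sketch, not the actual filter used for this post: the `Session` fields, the `looks_like_copying` helper, and every threshold are hypothetical, and they assume you have per-session counts available from your analytics export.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One recorded session. All field names are hypothetical."""
    session_id: str
    dead_clicks: int           # clicks on non-interactive elements
    text_selections: int       # times a large block of text was highlighted
    longest_idle_seconds: int  # longest gap with no activity after a click
    duration_seconds: int      # total session length


def looks_like_copying(
    s: Session,
    min_selections: int = 3,
    min_idle_seconds: int = 120,
    min_duration_seconds: int = 600,
) -> bool:
    """Flag a session that has dead clicks plus all three signals.

    The thresholds are illustrative guesses, not tuned values.
    """
    return (
        s.dead_clicks > 0
        and s.text_selections >= min_selections
        and s.longest_idle_seconds >= min_idle_seconds
        and s.duration_seconds >= min_duration_seconds
    )


# Made-up sample sessions, for demonstration only.
sessions = [
    Session("a", dead_clicks=12, text_selections=8,
            longest_idle_seconds=300, duration_seconds=2400),
    Session("b", dead_clicks=1, text_selections=0,
            longest_idle_seconds=5, duration_seconds=45),
    Session("c", dead_clicks=4, text_selections=5,
            longest_idle_seconds=20, duration_seconds=900),
]

suspects = [s.session_id for s in sessions if looks_like_copying(s)]
print(suspects)  # ['a']
```

Requiring every signal at once keeps the filter conservative: session "c" highlights plenty of text but never goes idle, so it is not flagged. A flag is still only a suspicion, not proof of copying.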

300 Dead-Click Sessions in 26 Hours

In the past 26 hours alone, we’ve logged over 300 dead-click sessions. For a non-major publisher, that’s an incredible amount of copying in such a short period! Some highlights:

  • 81 copies of our latest job description: We were inundated with applications—many clearly “Lazy GPT” responses (credit to Wil Reynolds for the term). Sorting through them was a headache.
  • One user clicked through every single page on our site: They spent nearly 6 hours highlighting and likely copying everything—including the footer. Impressive, in a strange way.
  • Many users spent around 40 minutes copying content: These users likely pasted large chunks into ChatGPT, the same AI that referred them to our site!

While it may be frustrating, finding humor in it is a better approach.

It's pretty clear this user is actually using ChatGPT to rewrite our entire article, no?

Summary of Findings

Time Period: 26 hours
Total Countries Involved: 51
Estimated Suspected Copying Sessions: 300 (limited)

Here's a breakdown:

Top 10 Countries by Suspected Copying Sessions

  1. United States: 24%
  2. United Kingdom: 11%
  3. India: 10%
  4. Australia: 6%
  5. Germany: 5%
  6. Canada: 3%
  7. Philippines: 3%
  8. Netherlands: 2%
  9. Korea: 2%
  10. Sweden: 2%

The suspected copying behavior spans 51 countries!

The top 3 countries (US, UK, and India) account for nearly 45% of all suspected copying sessions. The prominence of English-speaking countries suggests that language accessibility and topical relevance play the biggest roles, especially since this list doesn't match the top users of GenAI tools.
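The arithmetic behind that share can be checked quickly. The sketch below uses the rounded percentages from the top-10 list, so the totals are approximate; the variable names are my own.

```python
# Rounded shares (%) copied from the top-10 country list.
country_share = {
    "United States": 24,
    "United Kingdom": 11,
    "India": 10,
    "Australia": 6,
    "Germany": 5,
    "Canada": 3,
    "Philippines": 3,
    "Netherlands": 2,
    "Korea": 2,
    "Sweden": 2,
}

# Sum the three largest shares, then the whole top 10.
top3 = sum(sorted(country_share.values(), reverse=True)[:3])
top10 = sum(country_share.values())
remainder = 100 - top10  # share left for the other 41 countries

print(top3, top10, remainder)  # 45 68 32
```

With rounded inputs, the top 10 cover roughly 68% of suspected sessions, leaving about a third spread across the remaining 41 countries.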

Top 5 Referral Sources For Copying Sessions

  1. Google Organic (US): 41%
  2. ChatGPT: 17%
  3. Bing Organic: 14%
  4. Perplexity AI: 12%
  5. DuckDuckGo: 7%

Traditional search engines account for the majority of suspect traffic, but GenAI chatbots contribute a significant 29% of referrals. If anything, this shows GenAI's growing role in how users discover and interact with content!
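For reference, here is how the 29% figure falls out of the top-5 list. This is a small sketch using the rounded shares; the grouping into "GenAI" and "search" is my own labeling of the five listed sources.

```python
# Rounded referral shares (%) from the top-5 list.
referral_share = {
    "Google Organic": 41,
    "ChatGPT": 17,
    "Bing Organic": 14,
    "Perplexity AI": 12,
    "DuckDuckGo": 7,
}

# Sources treated as GenAI chatbots; everything else listed is a search engine.
GENAI_SOURCES = {"ChatGPT", "Perplexity AI"}

genai = sum(v for k, v in referral_share.items() if k in GENAI_SOURCES)
search = sum(v for k, v in referral_share.items() if k not in GENAI_SOURCES)
unlisted = 100 - genai - search  # sources outside the top 5

print(genai, search, unlisted)  # 29 62 9
```

Search engines come to about 62% and GenAI chatbots to 29%, with the remaining 9% or so coming from sources outside the top 5.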

Top 5 Countries That Use ChatGPT (% of all ChatGPT users)

  • India: 10%
  • Brazil: 9%
  • Japan: 7%
  • United States: 6% - this is the only country of significance in my dataset
  • China: 6%

Data accessed: 2024-08-19 | Source: SimilarWeb

Data Limitations

  • The methodology for identifying "suspected" copying isn't verifiable, so this is based on a gut feeling.
  • I also didn't manually review all sessions for confirmation—though I did create a filter that looks to be accurate based on spot checking.
  • Without data from similar websites or past trends, it's hard to gauge if these numbers are unusually high.
  • A 26-hour period is relatively short, so this analysis may not reflect long-term behaviors.

So, What Can You Do About It?

There's only so much you can do if you want legitimate users to enjoy your content freely. But you can take some creative steps:

  • Plagiarism Counter: Add a playful counter to your site: “This content has been copied into ChatGPT approximately 27 times since May 27, 2024.” You could even offer a reward for reporting plagiarism.
  • JS Script to Deter Copying: You can use JavaScript to make copying more difficult. Just have a technical SEO person review the implementation to make sure there are no rendering or indexing issues.

Have more ideas? Please share! I'm on LinkedIn.

Why This Might Be Good for SEO

Interestingly, the content most frequently copied from our site ranks in the top 2 results for several queries. It’s also featured in AI Overviews, suggesting Google recognizes its value.

While plagiarism might correlate with higher rankings, the reason for our success is that the content is genuinely helpful and relevant. If content thieves are targeting you, it’s a sign that you’ve created something valuable.

A former Googler recently confirmed that Google uses click data as a ranking signal. While Google denies that time-on-page directly influences rankings, user behavior data could still matter (see Google Leak). So, plagiarists spending over an hour on your site might help you stay at the top.

They can copy, paste, and even feed it to AI, but they can’t replicate the creativity and originality that makes your content stand out. Let them click and steal—because you’ll always be the source, and they’ll be the echo.
