
Tamara
SparkToro published data showing that AI tool output is so inconsistent that you really can't meaningfully track rankings in these tools.
- Used ChatGPT, Claude, and Google Search AIO/AI Mode
- 600 volunteers ran 12 unique prompts (e.g. _What are the top chef’s knives, brand and model, for an amateur home chef with a budget under $300?_) through each tool a total of 2,961 times, then copy/pasted the responses into survey forms
- Someone with mad skillz normalized product & brand results
> To get mathematical about it, *there’s a <1 in 100 chance that ChatGPT or Google’s AI, if asked 100X, will give you the same list of brands in any two responses*. Claude is just slightly more likely to give you the same list twice in a hundred runs, but even less likely to do so in the same order.
> In fact, when it comes to ordering, AI tool responses are so random that it’s more like 1 in 1,000 runs before you’d see two lists in the same order. And we didn’t even try to collect data on how the AIs described each brand or how positive/negative sentiment was around the recommendation.
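SparkToro didn't publish their scoring code, but the "same list twice" stat above can be sketched as a pairwise comparison over repeated runs: for every pair of responses to the same prompt, check whether the (normalized) brand lists match as sets, and whether they match in the same order. This is a minimal hypothetical sketch; the brand names and runs below are made up for illustration, not SparkToro's data.

```python
from itertools import combinations

def match_rates(runs):
    """Given repeated runs of one prompt, each a list of normalized
    brand names, return (same_set_rate, same_order_rate) across all
    pairs of runs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 0.0, 0.0
    # Same brands, any order
    same_set = sum(set(a) == set(b) for a, b in pairs) / len(pairs)
    # Same brands, same order
    same_order = sum(a == b for a, b in pairs) / len(pairs)
    return same_set, same_order

# Illustrative made-up runs (not real tool output):
runs = [
    ["Wusthof Classic", "Victorinox Fibrox", "Mac MTH-80"],
    ["Mac MTH-80", "Wusthof Classic", "Victorinox Fibrox"],
    ["Victorinox Fibrox", "Tojiro DP", "Wusthof Classic"],
]
set_rate, order_rate = match_rates(runs)
```

In this toy example only one of the three pairs shares the same brand set, and none match in order, which mirrors the study's finding that order agreement is far rarer than set agreement.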
There's way more to dive into here, & plenty of still-unanswerable questions.


