"Ordinary users don’t want to learn about the relative strengths and weaknesses of various products like Operator and Deep Research. They just want to ask ChatGPT a question and have it figure out the best way to answer it.
It’s a promising idea, but how well does it work in practice? On Friday, I asked ChatGPT Agent to perform four real-world tasks for me: buying groceries, purchasing a light bulb, planning an itinerary, and filtering a spreadsheet.
I found that ChatGPT Agent is dramatically better than its predecessor at grocery shopping. But it still made mistakes at this task. More broadly, the agent is nowhere close to the level of reliability required for me to really trust it.
And as a result I doubt that this iteration of computer-use technology will get a lot of use. Because an agent that frequently does the wrong thing is often worse than useless."
https://www.understandingai.org/p/chatgpt-agent-a-big-improvement-but
