As someone who's spent considerable time exploring AI-powered development tools, I recently took Windsurf, Codeium's latest IDE release, for a test drive. While it promises to keep developers in a "flow state," my experience revealed both impressive innovations and notable limitations.
You can see the entire video here or look into my notes below if you want a quick summary.
Side note: If you liked seeing how I was able to prompt the AI with just my voice, you might enjoy trying out Wispr Flow. I’ve been using it for a while now and it’s been a huge productivity boost for me. Almost don’t have to type anymore!
Windsurf has a few things that are really going well for it:
I really love how Windsurf runs Cascades, which are multi-step flows of changes across your code.
I like how they show that they're analyzing, diffing, creating and running terminal commands. I think the distinction between those feels very good.
I like how it's able to retry commands when they fail and attempt to keep going even though something didn't happen quite right. It seems to be very advanced in that regard.
Good Workflows
This thing is the best terminal command runner I've used to date. It seems way better than a lot of the other terminal options that I've had before as it has way more context than most terminals.
Unexpectedly, it seemed to work pretty well with two projects open, but I think that was mainly because I named the project I meant in each prompt. I worry that if you weren't specific, it might mingle the two.
GitHub Copilot-Level Autocomplete
The autocomplete is definitely worse than Cursor's and feels a lot like Copilot.
I don't get the same tab-tab-tab experience of auto-completing a bunch of code the way that I do when I'm using Cursor.
Bad Workflows
Conversation history is tied to the state of the chat, so there's no going back in time to ask something with just the right context. This severely limits how I generally work with AI models.
Specific tasks seem to work great, but I find the code it generates is not very creative. I generally get better code when I'm using Claude directly than when I'm using this tool. Its output seems to align too closely with my existing code at times.
It seems like it can run pretty quickly until you've got these big files. Then it seems to really take forever. I'm not sure why. Maybe staying fast just means working with smaller files.
Bad Ergonomics
Sometimes it asks for permission to run scripts; other times it seems to just go wild and create a bunch of stuff. This can feel inconsistent and a bit annoying.
The feeling of waiting really sucks, and I'd get kind of bored using it while waiting for things to complete.
I can’t work while it’s working. When I do try and keep myself busy and make edits during that time, there's a diff which actually derails its ability to generate its own code, which just feels bad.
Adding context just feels pretty boring and uninspired. There isn't a preview, and it doesn't have as many nice features as Cursor has when you're looking at the context you've included.
Additionally, the context you can include seems pretty limited compared to Cursor.
The hotkeys are just a little bit different from Cursor's, and it keeps messing me up. I keep having to hunt for a hotkey, which maybe is just a learning curve, but it feels bad.
Summary
Windsurf represents an interesting step forward in AI-powered development environments, particularly in its terminal command handling and error recovery capabilities. However, its promise of maintaining developer flow states falls short for me due to long delays in responses and workflow interruptions.
For specific tasks and smaller projects, particularly those involving complex terminal operations, Windsurf can be a powerful tool. But for larger projects or when you need more creative code generation, you might find yourself reaching for other tools in your development arsenal.