My concern here is that by gravitating to HTML you lose the ability for a human (you!) to easily co-author the document with the LLM. If it’s just an explainer for your consumption, that’s not a concern - but if it’s a spec sheet for something more complex, I deeply value being able to dive in and edit what is produced for me. With a HTML doc it is much harder to do that than with MD.
Now of course you could just reprompt your LLM to change the HTML - but when I already have a clear idea of what I want to say in my head, that’s just another roadblock in the way.
If this pattern becomes more common I suspect human/LLM co-creation will further dwindle in favour of just delegating voice, tone and content choice to the LLM. I was surprised not to see this concern in the blog post’s FAQ.
We have been authoring HTML by hand for decades with ease. Text editors are very good at it, and many have commands to auto-wrap, auto-close etc. Reading and writing is simple.
Templated though, not manually writing it out for every blog post say. I think GP means it just has more friction as a writing format than markdown for example.
I suppose that only applies if you constrain yourself to a raw teletypewriter emulator… in any proper coding environment, editing HTML should be absolutely simple - even an embedded WYSIWYG editor would be an option if rich model output is a way we head into.
A counter argument would be that all programming languages of the last decades have been plain text based. No other more structured format has ever gained traction even though modern editors could be argued to be able to support that easily. Turns out, it doesn't actually work that way.
But we’re not even dealing with a programming language in any classical sense here. Interacting with an LLM coding system is a multi-mode communication system with on-demand, purpose-generated ephemeral UI. That doesn’t fit any of the established categories, so I think carrying over constraints from them doesn’t make sense either.
Most people edit documents in Microsoft word, though, so it didn’t seem too far fetched that LLM content would be edited similarly, especially as more and more non-programmers use it.
Not sure how you use CC, but the last 6 months has felt like significant optimization efforts to me. Last year Claude would just read and edit files, now it's all kinds of basic tool gymnastics with grep/awk/sed/etc to narrowly slice and avoid token-heavy reads. Resuming sessions that aren't even that large get a scary prompt about using a significant portion of your token budget if you continue without compacting.
To me it feels like a worse experience, and they probably feel it too, but it makes sense from an optimization perspective. I've probably learned some shell tricks, but also going blind from watching Claude try dozens of variations of some multi-line chained and piped wall of bash nightmare, instead of just reading a few files.
I've noticed that's changed over the past month or so. Claude-code used to happily pipe build commands straight into context, but recently it's been running them as background tasks that pipe to file, and it'll search and do partial reads on the output instead.
It also gives tips on reducing context size when you run /context .
Presumably they are actually starting to feel the pinch on inference costs themselves with what still feels like a fairly generous max plan.
And it seems to use head, tail etc. more than it used to, even when unnecessary, which, combined with the recent(?) tendency of more chaining and as you said, piping to temp files and the like, totally screwed up claude code’s auto approval system for me (by auto approval I mean the system to decide which commands can be run without permission prompt, based on the permissions.allow setting among other things, not to be confused with a specific new approval mode called “auto” that burns more tokens to decide whether the command is safe). I had to write my own auto approval system and plug it in as a hook.
It depends what we mean I guess, isn’t Markdown supposed to allow [hx]ml tags anyway if user need them? Then it’s more about asking the LLM to generate Markdown with this in consideration, and privilege rendering the output of reports in the preferred browser after relevant rendering.
1. I believe many applications that use markdown allow html. Others don't due to security/rendering issues.
2. One of the limiting factors of LLM is context. An html table takes up way more tokens than a markdown table. Especially if it's a WYSIWYG editor that has all kinds of css and <span> tags just for fun.
When exploring a new idea or tool, my go to prompt is
```
In a single index.html, no dependencies, sparse styling, create an app that <idea>
```
Even before AI, it's how I built small tools, and there's something lovely about being able to email my friends the tool, and tell them "If you want to make a change, toss it to your LLM!"
It is incredible how far you get with a single HTML-file, containing styles and JS, when building dashboards, small apps and other utilities that can interact with an API or otherwise fetch data from somwhere.
I just drop it on my personal ~ folder on the shared server at work and voilà, everyone can check it out and use it immediately!
Same. When I iterate on a design for a new client, I create a simple index.html with inline CSS and when I'm satisfied with the result I take the file and insert it next to my project template files and just ask the LLM to take the design from index.html and work it in the template files.
I never really thought about this use case before, and I feel a bit dumb because of that. It’s so obvious that it would be useful. My focus with LLMs so far has been on The App, not all the ancillary stuff around The App. All that ancillary stuff doesn’t have to be fully complete or polished and doesn’t have to handle every possible case. It just needs to be functional enough to be useful. When you’re done with it, throw it away and generate a new one tomorrow.
The unreasonable effectiveness of the most popular "programming" language used by the most people on Earth ever that has been careful crafted by extremely dedicated people for decades and that is use to communicate by, to and from the most people that ever existed (because yes, most "apps", on mobile at least but even more and more on desktop are just HTML pages too).
I always worry about posting comments like this. A lot of tech bros are like "form logical arguments about why this clickbait isn't clickbait otherwise you are a troll". You beat me too it. Thanks for your service.
I also love how maybe 10% of the posts on here are like "some thing that has existed for 40+ years ,/:/; Claude/chatgpt". So much advertising at the expense of a dead internet it's ridiculous.
Web technologies got so many things right. People complain about it so much but it's amazing.
I worked with a vibe coded app at my last job (and since quit due to it) and because it was a nextjs SPA frontend with a separate API backend, the user facing urls didn't match the backend endpoints. Because AI uses react hooks for everything, state is in-memory, url-based routing isn't a thing unless you design for it. So links aren't free and thus we have no way for users to link to anything other than top-level entry points. LINKS! Especially for internal tools, everything being linkable is vital to collaboration and problem solving.
The need for uniform resource locations and verbs was so well thought out, 30 or 40 some odd years ago.
it does, it's a nextjs app so there's a bunch of top level pages that work as you'd expect. The issue is that we're trying to be a real company, those pages have nested logic, multiple views, sub navigation, search, filtering, wizards, all of which are in a cascade of react hooks because AI is obsessed with self-contained hooks. none of that state is linkable.
What i learned, by fire, is that nextjs works well with its nested routing and layout-to-hold-context magic. it's quite nifty but it's not free and not obvious, you have to design for it. I can't state enough that AI will ship a billion hooks, it loves hooks for self-contained state. My colleague made a good point that that style is a feature; it limits blast radius of N features all building on top of each other.
A couple of tradeoffs I don't see mentioned here for HTML vs MD:
- HTML is significantly less token-efficient
- Difficult to provide precise feedback on plans HTML, much easier to do this in MD.
Both of these tradeoffs set Anthropic up for success. Using HTML as our medium will increase token usage, and I'd bet they're investing in tools to mark up HTML (part of Claude Design) which will help improve lock-in. Either coincidence or brilliant strategy.
> I’ve started preferring HTML as an output format instead of Markdown and increasingly see this being used by others on the Claude Code team, this is why.
This is why I read long agent output either by using VIM and MacOS Quicklook (with a markdown extension for rendering) or paste output into MarkEdit (an editor with a preview pane; I think it’s cross platform?). Worst case, have an agent build you a simple local web page that interprets Markdown and renders it. Markdown was invented as a shorthand for web syntax[0]. That’s what it’s for! I bet you spend more tokens and time asking an agent to convert its native markdown to html than any of these.
Both the original Markdown spec [1] as well as CommonMark [2] clearly specify support for inline HTML. With that you can kind of get the best of both words depending on your use case.
For the most parts you just write the regular Markdown headers and paragraphs, embed images, insert tables etc without the need for any HTML tags, making it readable in source form. And if you want to embed an SVG file for example, which the author of the article mentions as one use case, you just embed the SVG directly, and people can render the Markdown in their favorite viewer.
Let's say you're viewing a raw Markdown file in VS Code. You come onto an HTML tag, so you hit Cmd+Shift+V to open the preview and that's it.
Of course for full-fledged web pages with interactive buttons and fully customized styling and all of that, which the author shows in some examples, this is not feasible. But you can get very far when you have mostly text/images/tables and just want to add some extras here and there.
The problem I have with this shift is that you basically give up on writing documentation by hand, not only for you, but for everybody who might want to edit your documentation.
I'm also not convinced that it solves meaningful problems :
> I've found I tend to not actually read more than a 100-line markdown file, and I certainly am not able to get anyone else in my organization to read it. But HTML documents are much easier to read,
In my experience, LLMs are not concise when it comes to documentation, so using an easier file format to read because the text is too large sounds to me like solving the wrong problem
> Markdown files are fairly hard to share since most browsers do not render them natively well.
Yes, but developers tools (IDE, Git forge) do. What most/some of them don't render natively IS HTML.
Firstly I think this is a super interesting approach to a real problem. I think it offers a lot. I wonder about frustrations around consistency etc though. Have you had any?
I have been building a slightly different solution to the same problem. So far I’m pretty happy with the results and I have enough returning users that I think others are too (https://sdocs.dev/analytics).
SDocs is cli (`sdoc file.md`) -> instantly rendered Markdown file in the browser
When you install the cli it gives you the option to add a note in your base agent file (`~/.claude/CLAUDE.md`, etc.). This means every agent chat knows about SDocs and you can say “sdoc me the plan when you’re done with it” and the file will pop open instead of you having to find that terminal session to know it’s done.
Going browser first means you’re not required to install anything to get a great experience.
Despite being in the browser, the content of SDocs rendered Markdown files remain entirely local to you. SDoc urls contain your markdown document's content in compressed base64 in the url fragment (the bit after the `#`):
The sdocs.dev webapp is purely a client side decoding and rendering engine for the content stored in the url fragment.
This also means you can share your .md files privately by sharing the url.
To go a bit deeper towards the HTML side of things I’ve added tagged code blocks that the agent is given documentation on how to use. Eg ```chart or ```mermaid (for mermaid diagrams). These then become interactive elements on the page (mermaid is best example of this currently)
For similar reasons, I strongly prefer org-mode to markdown. I find that with org-mode and extensions (such as in-line elisp) I have a _significantly_ more powerful system. For example, specs can have tasks and roadmaps inline which reduces risk of drift. The biggest downside is, unfortunately, not enough folks are emacs proficient.
I hadn't considered HTML and I'm definitely going to try this.
I moved from Markdown to JSON for all spec writing about 9 months ago. Although not HTML, it still has the same benefits. Claude and the other models are just so much more reliable in a structured format like JSON/HTML/XML.
The most important thing is that I can run static analysis on a structured format. This is important even for my spec documents. I can write data fields and have static analysis analyze it. For example, to confirm database fields match across various spec documents, etc.. The static analysis is also why you use JSON/XML instead of HTML, since you can now have your own custom schema.
Also don't use YAML, as that's far more unreliable. (If you chop a YAML file in half, it's still valid)
I think this is super interesting, but i think you and the OP is talking about two different problems: presenting text to end users and structuring text for agents
Or, you could do what I did, and process your markdown using Typst. The result is consistent, beautifully formatted documents, generated from your markdown, with mermaid diagrams.
I was running into issues with markdown and mermaid the other day and decided to tell Claude to use HTML, then realized that opened up the option of JS as well. Given we were trying to model state management in an existing typescript application, we could use ts-morph to correctly map out the call sites within different components, even use math to determine various factors of refactoring or relocating and so on.
It ended up being a large document but it was infinitely more useful than plain old markdown. I think I’ll do it more often now. Creating interactive specs is a really fascinating way to work. I’ve always verified my work in similar ways, but this was more… Verification-forward, I guess. And it took hours instead of days.
It’s actually really fun to be building and refactoring these days. No idea how long the fun will last, but I’m thoroughly enjoying it right now. I know it will make some people think I’m a hack, but I know that approach enabled me to do a faster, more targeted, ultimately better job of improving state management in the application. It’s not because Claude wrote the code (I still did quite a bit of it), but because Claude helped me establish the right work to do for the right reasons, using tools I couldn’t dream of throwing together 5 years ago.
I don’t think the idea of generating HTML is a novelty. Anyone using an LLM to create a web app has done that. Any “novelty” here is the idea of favoring HTML rather than Markdown for internal docs like specs, design docs, etc. Maybe you were already doing that. I sure haven’t been. In hindsight it’s obvious that it would be useful in some circumstances. Previously, I’ve had a bias against HTML because it’s annoying to read in a text editor.
I'm thinking of adding support for GitHub-flavored markdown (including things like Mermaid diagrams) in my agent wrapper tool and then adding something like a skill for Claude Code to always write GitHub-flavored markdown and use its features when communicating with me. Seems a lot better than general Markdown.
Though now I'm wondering: why not just add full HTML embedding support as well? I'm talking not just for specific deliverables, but for any of the agent's responses with the user.
Have we gone full circle? From the invention of HTML to the rebirth of plain text with MD to the rediscovery of HTML anbrhe age of LLM?
The facts that this article needs to list the pros of HTML over MD, like inreraction, visual density, etc, is weird to me. Maybe the ahdience is not tech-savvy people but I read it as an unnecessary word salad.
It feels like a word salad that's been created to hit a word target because it's also LLM generated. Really not worth spending 5-10 minutes reading some "AI" output.
I’ve been prompting my way to all kinds of interactive HTML artifacts the last month or so. It’s way more fun than making decks and static documentation.
I even did a workshop with PartyKit cursors, dot voting, reflection comments, and an individual rating at the end.
Oh, and I can add super lightweight analytics so I know who actually reads my things or interacts with my prototypes. ^_^
Just copy and paste the parent post into $yourFavouriteLLM and ask it to make a blog post for you to read about it. No need to ask the author to do more work than is necessary /s
We’ve been doing this for a couple of months at work for internal memos and decision records, it’s really powerful. I love being able to drop in interactive visuals and more dynamic content. We have a Cloudflare R2-backed Document Store for managing them, and CLI for publishing, and I’m working on an MCP server for long-term discovery and context.
My team kept asking if they could leave comments though, so I built Annotent [1] to help with that, which is also MCP-backed.
Claude understands AsciiDoc just fine. His training data makes it so that most AsciiDoc documents he generates uses Markdown-like capabilities, but a gentle push in an appended system prompt (like a “feel free to use all of AsciiDoc’s features”) makes him create very nice documents to iterate on, in my opinion.
I've been struggling with markdown recently as I really want the claims it makes in documents to be programmatically verifiable e.g. citations, I want a simple script to check that each of the files and lines of code it references actually exist. Perhaps HTML can be used for this. It certainly has a better chance than markdown.
Do we have local first html renderers that don’t complain about cors and wrong file addresses? I don’t want to spin up a server just to open an HTML file
Yeah, I agree with this. I've been doing the same thing. Whenever I have to do a review, I ask the llm to create a dashboard. It's a godsend for reducing cognitive burden.
I think the reason stuff like this wasn't done earlier was due to fears about context pollution, but post training has gotten so good that you can do virtually anything in the context window and not have it affect the quality of output.
> I find it difficult to read a markdown file of more than a hundred lines.
Apologies if I'm slightly demeaning here, but what? Markdown is largely plaintext. Unless the output is completely littered with markdown formatting (specifically, imo, tables and maybe links would be the hardest for a human to parse), is this not just saying "I have a difficult time reading large bodies of text" (which is of course fine - people prefer different ways ingesting information)
From reading the post, this seems like a "I personally prefer more visual, interactive elements in output" rather than "agents using markdown leads for less understandable output", and not at all what I would recommend for the average person to use unless said agent or interface or whatever had an extremely seamless way of displaying the HTML. If I had to open the agents response in a browser every time I wanted anything detailed I would lose my mind.
I've certainly had agents generate more rich visualizations such as through html, but I can't imagine using that as my default.
It's been confusing to me that so many people have treated markdown as the lingua franca for agent instructions when their training corpus must be dramatically biased to HTML instead of Mardown.
Markdown only makes sense for us meatbags becuse it's easy for us to edit and version control, but if you're sharing anything where the audience is an agent publicly, HTML must be just as interpretable.
~html has more capabilities than markdown~ the real title
Weren’t llms specifically originally set to output and print markdown format since it is simpler and easier everyone to read? No different rendering/libraries/apis to worry about…
I guess the author has never heard of Markdown editors with a preview feature, and doesn't know that the Claude Code VS Code plugin opens plans in preview mode.
I've felt the opposite. With MD being the preferred planning+documentation output for LLMs, the time of "hype" seems to be now. It seemed just a few years ago that devs hated writing properly formatted markdown.
Feels like markdown is token-efficient since its abbreviating HTML tags. Think about a large table. Also, HTML can't be especially rendered inside, say, Neovim, so now interacting with an LLM interface gravitates towards GUIs. Good for accessibility, I guess, there's that. Maybe the need for token efficiency will finally realize my dream of writing Slim[0], or some such, straight to the browser, full native compat.
> I find it difficult to read a markdown file of more than a hundred lines.
This is the sort of observation that should make you concerned about your own skills, rather than spur you to generate an article (longer than you can read easily) trying to give advice to other people.
The linked article could have been made using Markdown, and yet the author still chose to represent themselves this way; there's no point changing the tool when the problem is the user.