DOM to Semantic Markdown – For LLMs

DOM to Semantic Markdown converts an HTML DOM into Semantic Markdown, a format optimized for Large Language Models (LLMs). The conversion preserves semantic structure and metadata while reducing token usage. It captures semantic tags, image metadata, and table structures, and also offers AST conversion. Two features stand out: URL refification for token efficiency and automatic main content detection. Use cases include Q&A tasks, full-page analysis, and SEO auditing. The tool runs in the browser and in Node.js and provides customizable conversion options. The project is licensed under MIT and open to contributions from developers.
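To make the idea behind URL refification concrete, here is a minimal TypeScript sketch of the general technique: each distinct URL in a Markdown string is replaced by a short label, and a lookup table keeps the originals. The function `refifyLinks`, the `RefifiedMarkdown` type, and the `refN` label scheme are hypothetical names chosen for illustration. This is not the library's API or implementation, only one way such a transformation could work.

```typescript
// Illustrative sketch only: refifyLinks and RefifiedMarkdown are hypothetical
// names, not part of dom-to-semantic-markdown.
interface RefifiedMarkdown {
  markdown: string;                // text with URLs replaced by short labels
  references: Map<string, string>; // label -> original URL
}

function refifyLinks(markdown: string): RefifiedMarkdown {
  const labelByUrl = new Map<string, string>();
  const references = new Map<string, string>();

  // Matches the [text](url) part of inline links and images.
  const inlineLink = /\[([^\]]*)\]\(([^)\s]+)\)/g;

  const rewritten = markdown.replace(
    inlineLink,
    (_match: string, text: string, url: string): string => {
      let label = labelByUrl.get(url);
      if (label === undefined) {
        // First time this URL is seen: assign the next label (assumed scheme).
        label = `ref${labelByUrl.size}`;
        labelByUrl.set(url, label);
        references.set(label, url);
      }
      return `[${text}](${label})`;
    }
  );

  return { markdown: rewritten, references };
}

// Usage: repeated URLs share one label, so long URLs are spelled out only once.
const sample =
  "[Docs](https://example.com/a/very/long/path) and " +
  "[again](https://example.com/a/very/long/path) and " +
  "[other](https://example.com/another/path)";
const result = refifyLinks(sample);
console.log(result.markdown);
console.log(Object.fromEntries(result.references));
```

The saving comes from repetition: a page that links to the same long URL many times pays for it once in the reference table instead of once per link.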

https://github.com/romansky/dom-to-semantic-markdown
