Marker is a tool that converts PDF, EPUB, and MOBI files to Markdown. It claims to be 10 times faster than Nougat and more accurate on most documents. It is designed to remove headers, footers, and other artifacts, convert equations to LaTeX, format code blocks and tables, and support multiple languages. The tool relies on deep learning models to extract text, detect page layout, clean and format blocks, and postprocess the complete text. Marker emphasizes speed and a low risk of hallucination. The repository provides installation and usage instructions, along with benchmarks and the tool's known limitations.
https://github.com/VikParuchuri/marker
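
The stages named above (extract text, detect layout, clean and format blocks, postprocess) can be pictured with a toy sketch. This is not Marker's code or API: every function name here (`split_blocks`, `drop_repeated_blocks`, `postprocess`, `convert`) is hypothetical, the input is assumed to be already-extracted page text, and simple heuristics stand in for the deep learning models the tool actually relies on.

```python
import re
from collections import Counter


def split_blocks(page_text):
    """Stand-in for layout detection: blank-line-separated chunks become blocks."""
    return [b.strip() for b in re.split(r"\n\s*\n", page_text) if b.strip()]


def drop_repeated_blocks(pages_blocks, min_fraction=0.6):
    """Stand-in for cleaning: drop blocks that recur on most pages,
    which are likely running headers or footers (an assumed heuristic)."""
    n_pages = len(pages_blocks)
    # Count each distinct block once per page it appears on.
    counts = Counter(b for blocks in pages_blocks for b in set(blocks))
    repeated = {
        b for b, c in counts.items()
        if n_pages > 1 and c / n_pages >= min_fraction
    }
    return [[b for b in blocks if b not in repeated] for blocks in pages_blocks]


def postprocess(pages_blocks):
    """Join the surviving blocks into one string, collapsing runs of whitespace.
    (A real converter would need to preserve whitespace inside code blocks.)"""
    blocks = [re.sub(r"\s+", " ", b) for blocks in pages_blocks for b in blocks]
    return "\n\n".join(blocks)


def convert(pages):
    """Run the toy pipeline over a list of per-page text strings."""
    pages_blocks = [split_blocks(p) for p in pages]
    pages_blocks = drop_repeated_blocks(pages_blocks)
    return postprocess(pages_blocks)
```

For example, calling `convert` on two pages that share the same title line and footer line would keep only the body paragraphs, since those two blocks appear on every page. For real installation and usage, follow the instructions in the repository linked above.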