Signal recently open-sourced a SQLite extension that improves Full-Text Search (FTS) support for non-Latin languages. This article describes the internals of SQLite’s FTS5 implementation for developers who want to go deeper than the official documentation. It explains how FTS5 is used, introduces Signal’s FTS5 extension for Chinese and Japanese text, and walks through FTS5’s internal structure. It highlights how B-Trees are used to improve insertion performance and how segments are merged to keep searches efficient. The author acknowledges that FTS5 has further complexity beyond what the article covers.
https://darksi.de/13.sqlite-fts5-structure/
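
As a rough illustration of the basic FTS5 usage the article starts from, here is a minimal sketch using Python's built-in `sqlite3` module. It is not taken from the article: the table name `notes`, the column `body`, the helper `search_notes`, and the sample rows are all hypothetical. It uses SQLite's default tokenizer and does not load Signal's extension, and it assumes the SQLite build behind `sqlite3` was compiled with FTS5 (the helper returns `None` otherwise).

```python
import sqlite3


def search_notes(query):
    """Return the bodies of rows matching an FTS5 query.

    Returns None when the underlying SQLite build lacks FTS5.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table and column names, chosen for illustration only.
        conn.execute("CREATE VIRTUAL TABLE notes USING fts5(body)")
    except sqlite3.OperationalError:
        # FTS5 is an optional module and may not be compiled in.
        conn.close()
        return None

    # Hypothetical sample data.
    conn.executemany(
        "INSERT INTO notes(body) VALUES (?)",
        [
            ("signal open-sourced a tokenizer",),
            ("segments are merged over time",),
            ("unrelated text",),
        ],
    )

    # MATCH runs the full-text query against the FTS5 index.
    rows = conn.execute(
        "SELECT body FROM notes WHERE notes MATCH ? ORDER BY rowid",
        (query,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


result = search_notes("segments")
print(result)
```

The default tokenizer splits on whitespace and punctuation, which works poorly for text written without spaces between words; that gap is what a language-aware tokenizer extension such as Signal's is meant to address.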