TL;DR summary of stories on the internet
The author reflects on the ease of learning CUDA once they discovered it’s essentially C++ with some additional features. However, coming in with C++ habits could lead to suboptimal code, as lessons in memory coalescing reveal. The majority of performance in a modern PC lies in specialized hardware like GPUs, with specialized chips for machine […]
This experimental Lisp compiler, written in uLisp, compiles Lisp functions into RISC-V machine code. The compiler can run on a Raspberry Pi Pico 2 or another RP2350-based board. It needs no tokenizer or parser, as Lisp programs have a consistent structure. The compiler uses the Common Lisp subset supported by uLisp and RISC-V […]
The author dives deep into running Llama locally with minimal dependencies, revealing surprising details often hidden behind tools like Ollama and Hugging Face’s transformers package. The setup involves downloading model weights and installing torch, fairscale, and blobfile. The author covers a technical overview, dependencies, a beam-search implementation, and performance notes on running models on various […]
Large language models (LLMs) rely on decoder-only transformer architectures, which place significant demands on GPU memory because key/value information must be retained for all historical tokens. In response, Anchor-based LLMs (AnLLMs) have been introduced, using an anchor-based self-attention network (AnSAN) and an anchor-based inference strategy to compress sequence information into an anchor token. […]
Pocache is a lightweight in-app caching package that optimizes performance in concurrent environments by preemptively updating cache entries nearing expiration, reducing redundant database calls. Its features include a configurable threshold window, serving stale values, and debouncing concurrent requests. The package uses HashiCorp’s Go LRU package as the default storage and allows for custom underlying storage. By […]
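The preemptive-refresh idea in the summary above can be illustrated with a minimal sketch. This is not Pocache's API (Pocache is a Go package); the class name `PreemptiveCache`, the `loader`, `ttl`, `threshold`, and `clock` parameters are all hypothetical, and the refresh runs inline rather than in the background so the example stays deterministic. Debouncing of concurrent requests is omitted.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class PreemptiveCache:
    """Toy TTL cache that refreshes an entry shortly before it expires.

    Illustrative only: names and behavior are assumptions, not Pocache's API.
    """

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        ttl: float,
        threshold: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader        # fetches a fresh value, e.g. from a database
        self._ttl = ttl              # lifetime of an entry, in seconds
        self._threshold = threshold  # refresh window before expiry, in seconds
        self._clock = clock          # injectable clock, useful for testing
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}

    def _load_and_store(self, key: Hashable) -> Any:
        value = self._loader(key)
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is None or now >= entry[1]:
            # Miss, or entry fully expired: the caller waits for a fresh load.
            return self._load_and_store(key)
        value, expires_at = entry
        if expires_at - now <= self._threshold:
            # Entry is inside the threshold window: refresh it ahead of expiry.
            # A real implementation would do this in the background.
            self._load_and_store(key)
        # The caller is served the value that was cached when get() was called.
        return value
```

With `ttl=10` and `threshold=2`, a read at second 9 still returns the cached value but also reloads the entry, so a later read at second 12 is served from the cache instead of hitting the loader again.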
In this study, we delve into the generalization properties of binary logistic classification, showcasing the dynamics of Grokking in a random feature model. We discover that Grokking, characterized by delayed generalization and non-monotonic test loss, is enhanced when the model is applied to training sets near linear separability. Despite a perfect generalizing solution being available, […]
The Pre-Scheme Restoration project has made significant progress in porting the codebase to R7RS Scheme implementations. The use of “The Incomplete Scheme48 R7RS Compatibility Library” has helped maintain compatibility with Scheme 48 during the porting process. Scsh was adopted as a tooling platform for better support with filesystem and external processes. Implementing R7RS-small for Scheme […]
Darya Kawa Mirza, a self-taught Kurdish astrophotographer, recently showcased the moon in exceptional detail by stitching together 81,000 images into a massive 708-gigabyte composite. His work highlights the intricacies of the lunar topography, showcasing individual craters and colored spots formed by asteroid strikes and volcanic eruptions. Mirza’s technique of “phase fusion” involved piecing together images […]
Swyx catches up with former guests and welcomes a rare guest writer: Eugene Cheah, cofounder of Featherless.AI, an inference platform with 2,000 open source models accessible via a single API for a flat rate. NVIDIA’s Blackwell series rollout to OpenAI has caused excitement, while cousin Lisa introduces the MI325X and Cerebras files for IPO. […]
The author discusses their interest in unconventional programming paradigms, specifically learning languages like LISP and Prolog. The author notes the challenges of studying Prolog due to its esoteric syntax, but also highlights the novelty of its approach to data structures. The post delves into the use of tags in Prolog to […]