Last updated 2 years ago by Brian Gesiak
The prior article in this series explained how the Swift and Clang compilers used llvm::SourceMgr to emit diagnostics for source locations in memory buffers, represented by the class llvm::MemoryBuffer. This article focuses on llvm::MemoryBuffer, the primary abstraction for reading files and streams into memory. Since it's used by Swift, Clang, and LLVM tools like llvm-tblgen, I found it valuable to understand how it works.
The documentation for libLLVMSupport's llvm::MemoryBuffer class says it "provides simple read-only access to a block of memory, and provides simple methods for reading files and standard input into a memory buffer." To better understand how it does that, I tried writing a simple C++ program, called read.cpp, that reads a file (itself, in this case) into memory. For simplicity's sake, my program is only meant to run on Unix systems.
My read.cpp program reads a file into memory by using various system calls. These are requests made to the operating system for things like "open a file and give me its file descriptor," or "read 8 bytes from the file with this file descriptor." Julia Evans has a wonderful comic that explains them further.
My read.cpp program uses four system calls:
Once the read.cpp program allocates memory and reads its own source file into that memory, it increments the char * pointer into the memory and prints out the first line of the file.
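To make the steps concrete, here is a minimal sketch of a program along these lines. It is not the original read.cpp: I'm assuming the four system calls are open, fstat, read, and close, and the helper name readFirstLine is hypothetical. It allocates a buffer sized from fstat, fills it with read, and then walks a char * through the buffer until it reaches the first newline.

```cpp
#include <fcntl.h>     // open
#include <sys/stat.h>  // fstat, struct stat
#include <unistd.h>    // read, close

#include <cstdlib>     // std::malloc, std::free
#include <string>

// Hypothetical helper (not part of LLVM or the original read.cpp).
// Returns the first line of the file at `path`, without the newline,
// or an empty string if a system call or the allocation fails.
std::string readFirstLine(const char *path) {
  // 1. open: "open a file and give me its file descriptor."
  int fd = open(path, O_RDONLY);
  if (fd == -1) return "";

  // 2. fstat: ask how large the file is, so we know how much to allocate.
  struct stat info;
  if (fstat(fd, &info) == -1) {
    close(fd);
    return "";
  }
  size_t size = static_cast<size_t>(info.st_size);

  // Allocate one extra byte so an empty file never requests zero bytes.
  char *buffer = static_cast<char *>(std::malloc(size + 1));
  if (buffer == nullptr) {
    close(fd);
    return "";
  }

  // 3. read: copy the file's bytes into the buffer. read may return fewer
  // bytes than requested, so loop until the buffer is full or read stops.
  size_t total = 0;
  while (total < size) {
    ssize_t n = read(fd, buffer + total, size - total);
    if (n <= 0) break;  // error or unexpected end of file
    total += static_cast<size_t>(n);
  }

  // 4. close: hand the file descriptor back to the operating system.
  close(fd);

  // Increment a char * through the buffer until the first newline
  // (or the end of the data that was actually read).
  const char *cursor = buffer;
  const char *end = buffer + total;
  while (cursor != end && *cursor != '\n') ++cursor;

  std::string line(buffer, static_cast<size_t>(cursor - buffer));
  std::free(buffer);
  return line;
}
```

Calling readFirstLine("read.cpp") from a main function and printing the result would reproduce the behavior described above, assuming the program is run from the directory that contains read.cpp.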