Compiler Construction Toolkit

Define the language.
Parse() handles the rest.

A complete source-to-binary compiler pipeline in a single fluent Delphi API. Lexer. Parser. Semantics. C++23 codegen. Native binary via Zig.

Built with Delphi  ·  Zig backend  ·  Windows + Linux
The Pipeline

One config.
Every stage.

A single TParse object drives all five stages of compilation. There is no hardcoded language knowledge anywhere in the toolkit. You describe the language. The pipeline runs it.

Source Text
Your language file

Lexer  →  token stream
Keywords · Operators · Strings · Comments

Pratt Parser  →  AST nodes
Prefix · Infix · Statement handlers

Semantics  →  enriched AST
Scopes · Symbols · Types · PARSE_ATTR_*

C++23 CodeGen  →  .h + .cpp
Fluent IR emitter · Dual-file output

Zig / Clang  →  native binary
exe · lib · dll  ·  Win64 · Linux64

Configuration Surfaces

Three surfaces.
One language.

Surface 01  /  Lexer
What tokens exist

Register every token your language uses: keywords, operators, string delimiters, comment styles, number prefixes. The lexer has no built-in language knowledge.

AddKeyword · AddOperator · AddStringStyle · AddBlockComment

Surface 02  /  Grammar
What tokens mean

Register Pratt parser handlers for prefix, infix-left, infix-right, and statement positions. Binding powers control precedence. Every AST node is yours to shape.

RegisterPrefix · RegisterInfixLeft · RegisterStatement · RegisterBinaryOp

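
The binding-power idea behind the Grammar surface can be shown without any toolkit code. What follows is a minimal, standalone sketch in C++ (the language Parse() emits), not ParseKit's parser: the names PrattSketch, BindingPower, IsRightAssoc, and Parenthesise are hypothetical, and the operator table is an assumption chosen for illustration. It parses integer arithmetic and returns a fully parenthesised string so the resulting grouping is visible.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Illustrative only: a tiny Pratt parser over integers and + - * / ^.
// Every name here is hypothetical; none of it is ParseKit code.
struct PrattSketch {
    std::string src;
    std::size_t pos = 0;

    // Left binding power: higher binds tighter. 0 means "not an infix operator".
    static int BindingPower(char op) {
        switch (op) {
            case '+': case '-': return 10;
            case '*': case '/': return 20;
            case '^':           return 30;
            default:            return 0;
        }
    }

    static bool IsRightAssoc(char op) { return op == '^'; }

    char Peek() const { return pos < src.size() ? src[pos] : '\0'; }

    // Prefix position: an integer literal or a parenthesised expression.
    std::string ParsePrefix() {
        if (Peek() == '(') {
            ++pos;
            std::string inner = ParseExpression(0);
            if (Peek() != ')') throw std::runtime_error("expected ')'");
            ++pos;
            return inner;
        }
        std::string digits;
        while (std::isdigit(static_cast<unsigned char>(Peek()))) {
            digits += src[pos++];
        }
        if (digits.empty()) throw std::runtime_error("expected a number");
        return digits;
    }

    // Infix loop: keep consuming operators that bind tighter than min_bp.
    std::string ParseExpression(int min_bp) {
        std::string left = ParsePrefix();
        for (;;) {
            char op = Peek();
            int bp = BindingPower(op);
            if (bp == 0 || bp <= min_bp) break;
            ++pos;
            // A right-associative operator recurses with bp - 1, so an equal
            // operator further right is absorbed by the inner call.
            std::string right = ParseExpression(IsRightAssoc(op) ? bp - 1 : bp);
            left = "(" + left + op + right + ")";
        }
        return left;
    }
};

// Convenience wrapper: parse the whole text and show the grouping.
std::string Parenthesise(const std::string& text) {
    PrattSketch parser{text};
    return parser.ParseExpression(0);
}
```

Under these assumptions, Parenthesise("1+2*3") yields (1+(2*3)), Parenthesise("8-3-2") yields ((8-3)-2), and Parenthesise("2^3^2") yields (2^(3^2)). The only difference between the left- and right-associative cases is the minimum binding power passed to the recursive call, which is the distinction the infix-left and infix-right handler positions express.
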
Surface 03  /  Emit
What nodes produce

Register an emitter for every AST node kind. The fluent IR API generates well-formed C++23 text. ExprToString converts expression trees to inline C++ strings.

RegisterEmitter · AGen.Func · AGen.IfStmt · ExprToString

Showcase Languages

Four syntaxes.
One toolkit.

Parse() ships with four sample language implementations. Pascal, Lua, BASIC, and Scheme prove the toolkit handles radically different syntax families without any changes to the framework.

Pascal
Classic Structured Pascal

Case-insensitive keywords, := assignment, begin/end blocks, typed variables, typed procedures and functions. The Result return convention. Full C++23 forward declarations emitted to the header file.

  • Typed variable declarations with var blocks
  • Procedures and functions with typed parameters
  • if/then/else, while/do, for/to/downto/do
  • writeln with multiple comma-separated arguments
  • Symbol resolution and semantic error reporting
HelloWorld.pas
program HelloWorld;

var
  greeting: string;
  count: integer;

procedure PrintBanner(msg: string);
begin
  writeln('--- ', msg, ' ---');
end;

function Add(a: integer; b: integer): integer;
begin
  Result := a + b;
end;

begin
  greeting := 'Hello, World!';
  count := 5;
  PrintBanner(greeting);
  writeln('5 + 3 = ', Add(5, 3));
  for i := 1 to count do
    writeln('  Step ', i);
end.
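
The symbol resolution listed among the Pascal features typically rests on a chain of scopes, where each lookup walks outward until a declaration is found. The sketch below is a standalone C++ illustration under that assumption, not the toolkit's semantic engine: Scope, Fold, Declare, and Lookup are hypothetical names, and names are folded to lower case here only to mirror Pascal's case-insensitive identifiers.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <optional>
#include <string>

// Hypothetical scope node: a symbol map plus a link to the enclosing scope.
struct Scope {
    const Scope* parent = nullptr;
    std::map<std::string, std::string> symbols;  // folded name -> type name

    // Pascal identifiers are case-insensitive, so compare in lower case.
    static std::string Fold(std::string name) {
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        return name;
    }

    // Returns false when the name already exists in this scope (a redeclaration).
    bool Declare(const std::string& name, const std::string& type) {
        return symbols.emplace(Fold(name), type).second;
    }

    // Walks outward through enclosing scopes; an empty optional means "undeclared".
    std::optional<std::string> Lookup(const std::string& name) const {
        const std::string key = Fold(name);
        for (const Scope* s = this; s != nullptr; s = s->parent) {
            auto it = s->symbols.find(key);
            if (it != s->symbols.end()) return it->second;
        }
        return std::nullopt;
    }
};
```

In this sketch a procedure body would get its own Scope whose parent is the program scope, so msg resolves inside PrintBanner while greeting resolves through the parent link. A failed Lookup or a false result from Declare is the point where a semantic error would be reported.
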
Lua
Dynamically Typed Lua

No type annotations anywhere. Literal-based type inference at declaration sites. Call-site pre-scan for parameter type inference. local/global scope, .. string concatenation.

  • Zero type annotations, all inference from literals
  • function/end with inferred return type
  • if/then/elseif/else/end, for/do/end, while/do/end
  • .. string concatenation maps to <<
  • print() variadic output via std::cout
HelloWorld.lua
-- HelloWorld.lua

local greeting = "Hello, World!"
local count = 5

function PrintBanner(msg)
  print("--- " .. msg .. " ---")
end

function Add(a, b)
  return a + b
end

PrintBanner(greeting)
print("5 + 3 = " .. Add(5, 3))

for i = 1, count do
  print("  Step " .. i)
end
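
Literal-based inference of the kind described for the Lua sample can be pictured as a mapping from the spelling of an initialiser to a C++ type. The following is a minimal sketch in C++ under assumptions the page does not confirm: the function name InferCppType is hypothetical, and the chosen target types (int, double, std::string, bool, with auto as a fallback) are illustrative rather than the toolkit's actual mapping.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: guess a C++ type name from the text of a literal.
std::string InferCppType(const std::string& literal) {
    if (literal.empty()) return "auto";
    if (literal == "true" || literal == "false") return "bool";
    if (literal.size() >= 2 && literal.front() == '"' && literal.back() == '"') {
        return "std::string";
    }

    bool seen_digit = false;
    bool seen_dot = false;
    for (std::size_t i = 0; i < literal.size(); ++i) {
        const char c = literal[i];
        if (std::isdigit(static_cast<unsigned char>(c))) {
            seen_digit = true;
        } else if (c == '.' && !seen_dot) {
            seen_dot = true;          // first decimal point: floating literal
        } else if (c == '-' && i == 0) {
            continue;                 // leading minus sign
        } else {
            return "auto";            // not a literal this sketch understands
        }
    }
    if (!seen_digit) return "auto";
    return seen_dot ? "double" : "int";
}
```

With this mapping, the initialiser 5 in local count = 5 gives int and "Hello, World!" gives std::string. Anything that is not a recognised literal falls back to auto, which is where a call-site scan would have to supply the missing information.
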
BASIC
Classic BASIC

Implicit program body, no header keyword required. Dim/As typed declarations. The = operator serves dual duty: assignment at statement level, equality inside expressions.

  • Dim/As typed variable declarations
  • Sub/End Sub and Function/End Function
  • If/Then/Else/End If, For/To/Next, While/Wend
  • & string concatenation
  • Function-name return convention: Add = a + b
HelloWorld.bas
Dim greeting As String
Dim count As Integer

Sub PrintBanner(msg As String)
  Print "--- " & msg & " ---"
End Sub

Function Add(a As Integer, b As Integer) As Integer
  Add = a + b
End Function

greeting = "Hello, World!"
count = 5
PrintBanner(greeting)
Print "5 + 3 = " & Add(5, 3)

For i = 1 To count
  Print "  Step " & i
Next i
Scheme
S-Expression Scheme

A single ( handler drives all parsing, with no infix operators, no block keywords, and no statement terminator. Everything is an S-expression. Kebab-case identifiers are mangled to snake_case for C++.

  • Zero RegisterInfixLeft calls: pure prefix dispatch
  • (define var expr) and (define (f args) body)
  • (if cond then else), (begin expr...)
  • Kebab-case to snake_case name mangling
  • #t / #f boolean literals
HelloWorld.scm
; HelloWorld.scm

(define greeting "Hello, World!")
(define count 5)

(define (print-banner msg)
  (display "--- ")
  (display msg)
  (display " ---")
  (newline))

(define (add a b)
  (+ a b))

(print-banner greeting)
(display "5 + 3 = ")
(display (add 5 3))
(newline)
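
The kebab-case to snake_case mangling mentioned for Scheme exists because a hyphen is not legal inside a C++ identifier. A minimal C++ sketch of that step follows, together with the #t / #f literal mapping. Both function names are hypothetical, and the real toolkit may treat further characters or reserved words that this sketch ignores.

```cpp
#include <string>

// Hypothetical helper: make a Scheme identifier usable as a C++ identifier
// by replacing each hyphen with an underscore.
std::string MangleIdentifier(const std::string& name) {
    if (name == "-") return name;  // a lone "-" is the subtraction operator
    std::string out = name;
    for (char& c : out) {
        if (c == '-') c = '_';
    }
    return out;
}

// Hypothetical helper: map Scheme boolean literals to C++ ones.
// Any other token is returned unchanged.
std::string MapBooleanLiteral(const std::string& token) {
    if (token == "#t") return "true";
    if (token == "#f") return "false";
    return token;
}
```

Applied to the sample above, print-banner becomes print_banner while add is left unchanged.
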
Capabilities

What ships in the box

One Config, One Language

A single TParse drives every stage. Define it once; the lexer, parser, semantics, and codegen all read from it.

Pratt Parser Built In

Top-down operator precedence parsing ready to use. Register handlers, set binding powers. No grammar files, no parser generators required.

Attribute-Store AST

Every node carries a string-keyed TValue dictionary. Pipeline stages communicate through attributes. No coupling between components.

Semantic Engine

Built-in scope trees, symbol declaration and lookup, type compatibility checking, and coercion annotation. Register only the handlers you need.

C++23 Fluent IR

Structured builder API generates well-formed C++23. Functions, control flow, and expressions are all built fluently, so raw string formatting is not required.

Dual-File Output

Generates both a .h header and .cpp source. Language authors control which output receives each statement.

Zig Build Backend

Generated C++ compiles to native binaries via Zig/Clang. Build modes: exe, lib, dll. Platforms: Win64 and Linux64. Optimization levels: debug, release-safe, release-fast, release-small.

Type Inference Surface

Built-in literal-type mapping and call-site scanning for dynamically-typed languages. Infer variable types from initialisers with no explicit annotations.

LSP-Ready AST

After semantic analysis the AST is fully self-sufficient. Every node carries resolved type, symbol, scope, and storage. Ready for language server integration.
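
The attribute-store idea behind these capabilities can be sketched independently of the toolkit: one pass writes a key onto a node, and a later pass reads that key without knowing who wrote it. The C++ below is an illustration under stated assumptions, not ParseKit's AST. Node, AttrValue, AnnotateTypes, and DeclFor are hypothetical, as are the node kinds and the "type.resolved" key (the real attribute keys are the PARSE_ATTR_* constants, which are not listed here).

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical attribute value: a small tagged union, standing in for TValue.
using AttrValue = std::variant<std::monostate, bool, long long, double, std::string>;

// Hypothetical AST node: a kind, children, and a string-keyed attribute store.
struct Node {
    std::string kind;
    std::map<std::string, AttrValue> attrs;
    std::vector<std::unique_ptr<Node>> children;

    void SetAttr(const std::string& key, AttrValue value) {
        attrs[key] = std::move(value);
    }

    // Returns the string stored under key, or fallback if absent or not a string.
    std::string GetString(const std::string& key, const std::string& fallback) const {
        auto it = attrs.find(key);
        if (it == attrs.end()) return fallback;
        const std::string* text = std::get_if<std::string>(&it->second);
        return text ? *text : fallback;
    }
};

// Stand-in for a semantic pass: annotate literal nodes with a resolved type.
void AnnotateTypes(Node& node) {
    if (node.kind == "expr.integer") node.SetAttr("type.resolved", std::string("int"));
    if (node.kind == "expr.string")  node.SetAttr("type.resolved", std::string("std::string"));
    for (auto& child : node.children) AnnotateTypes(*child);
}

// Stand-in for an emit pass: it reads only the attribute, never the pass that set it.
std::string DeclFor(const Node& initialiser, const std::string& name) {
    return initialiser.GetString("type.resolved", "auto") + " " + name + ";";
}
```

Before AnnotateTypes runs, DeclFor on an "expr.integer" node named count returns auto count; and afterwards it returns int count;. The emit step depends only on the attribute key, which is the decoupling the attribute store is meant to provide, and a language server could read the same keys.
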

How It Works

From config to binary
in five steps

01
Download the toolchain

Grab parsekit-toolchain.zip from the permanent release entry. It is unzipped directly into the root of your ParseKit source directory, which the next step sets up.

02
Clone the source

Clone or download the ParseKit repo. Open the project group in Delphi 11 or later and build. Parse is the only unit your own code needs in its uses clause: uses Parse;

03
Configure the lexer surface

Register keywords, operators, string styles, and comment styles. Set the statement terminator and block delimiters for your language.

04
Register grammar and emit handlers

Wire up prefix, infix, and statement handlers to build your AST. Register emitters to turn each node kind into C++23. Add semantic rules as needed.

05
Call Compile()

Set the source file, output path, target platform, build mode, and optimization level. Then call Compile(True): Parse() tokenizes, parses, analyses, emits C++, compiles, and optionally runs the result.

MyLang.pas
var
  LParse: TParse;
begin
  LParse := TParse.Create();
  try
    // ── Lexer surface ──────────────────
    LParse.Config()
      .CaseSensitiveKeywords(False)
      .AddKeyword('print', 'keyword.print')
      .AddOperator('(', 'delimiter.lparen')
      .AddOperator(')', 'delimiter.rparen')
      .AddStringStyle('"', '"',
        PARSE_KIND_STRING, True)
      .SetStatementTerminator('');

    LParse.Config().RegisterLiteralPrefixes();

    // ── Grammar surface ────────────────
    LParse.Config().RegisterStatement(
      'keyword.print', 'stmt.print',
      function(AParser: TParseParserBase)
        : TParseASTNodeBase
      var
        LNode: TParseASTNode;
      begin
        LNode := AParser.CreateNode();
        AParser.Consume();
        AParser.Expect('delimiter.lparen');
        LNode.AddChild(TParseASTNode(
          AParser.ParseExpression(0)));
        AParser.Expect('delimiter.rparen');
        Result := LNode;
      end);

    // ── Emit surface ───────────────────
    LParse.Config().RegisterEmitter(
      'stmt.print',
      procedure(ANode: TParseASTNodeBase;
        AGen: TParseIRBase)
      begin
        AGen.Include('iostream');
        AGen.Stmt('std::cout << ' +
          LParse.Config().ExprToString(
            ANode.GetChild(0)) +
          ' << std::endl;');
      end);

    LParse.SetSourceFile('hello.mylang');
    LParse.SetOutputPath('output');
    LParse.SetTargetPlatform(tpWin64);
    LParse.SetBuildMode(bmExe);
    LParse.SetOptimizeLevel(olDebug);

    LParse.Compile(True);
  finally
    LParse.Free();
  end;
end;
Get Parse()
Everything you need
to ship a language

The toolchain zip contains the Zig compiler and C++ runtime. Unzip it into the root of your ParseKit source directory. The source is on GitHub. Delphi 11 or later is required to build.

Permanent release entry: the URL never changes. When the toolchain updates, the same entry is replaced in place.

Requirement  | Minimum              | Notes
Host OS      | Windows 10/11 x64    | Supported
Delphi       | Delphi 11 Alexandria | To build the toolkit from source
Linux target | WSL2 + Ubuntu        | wsl --install -d Ubuntu  ·  Parse() locates it automatically