Files and Directories in sh2

This tutorial covers practical patterns for working with files and directories in sh2. You’ll learn how to list, find, copy, and delete files safely—without the quoting bugs that plague Bash scripts.

What you’ll learn:

Why file operations are error-prone in Bash
How sh2’s “no implicit expansion” rule prevents common bugs
Safe patterns for find, iteration, and deletion
Using with cwd(...) for scoped directory changes
Logging with with redirect { ... }

Prerequisites: Complete Tutorial 01: Getting Started and Tutorial 03: Error Handling.

1. Why Files Are Hard in Bash

File operations in Bash are a minefield. The culprits:

Word splitting: Spaces in filenames become multiple arguments
Globbing: * and ? expand unexpectedly
Unquoted variables: rm -rf $dir can delete your entire system

A tiny Bash script that breaks on spaces

# ❌ Dangerous Bash script
dir="/tmp/my project"
cd $dir              # Fails: tries to cd to "/tmp/my" then "project"
rm -rf $dir/*.bak    # Fails: word splits the path

If $dir isn’t quoted, Bash splits it on spaces. If $dir is empty, rm -rf /*.bak runs on root.

Even experienced developers get bitten:

# ❌ Still dangerous
for f in $(find . -name "*.txt"); do
    rm "$f"          # Breaks on filenames with spaces
done

The $(...) command substitution splits output on whitespace. A file named my file.txt becomes two arguments: my and file.txt.

2. The sh2 Rule That Changes Everything

sh2 performs no implicit expansion. Strings are strict literals.

This single rule prevents most file-handling bugs:

func main() {
    run("echo", "*")          # Prints: *  (literal asterisk)
    run("echo", "hello world") # One argument, not two
}

Test it:

sh2do 'run("echo", "*")'

Output:

In Bash, echo * would list all files. In sh2, "*" is just an asterisk.

Paths with spaces remain one argument

func main() {
    let path = "/tmp/my project/data file.txt"
    run("cat", path)  # Always one argument, no quoting needed
}

You never need to think about quoting. Every argument is safely separated.

3. Listing Files Safely

Simple listing with `run("ls", ...)`

func main() {
    run("ls", "-la", "/tmp")
}

Safe flat iteration with `glob()` (Bash only)

For simple file matching in the current directory (flat, non-recursive), use glob():

func main() {
    # glob() returns a sorted list of matches
    for f in glob("*.txt") {
        print("Found text file: " & f)
    }
    
    # Check if empty (no matches = empty list)
    let logs = glob("*.log")
    if count(logs) == 0 {
        print("No logs found")
    }
}

Capturing file lists with `find`

For programmatic use, find with capture and lines:

func main() {
    let files = capture(run("find", ".", "-type", "f", "-name", "*.txt"))
    
    for f in lines(files) {
        if f != "" {
            print("Found: " & f)
        }
    }
}

Key points:

capture(run("find", ...)) returns newline-separated output
lines(...) splits into an iterable list
The if f != "" check handles trailing newlines

Honest limitation: newlines in filenames

If a filename contains a literal newline character (rare but possible), the lines() approach breaks. For maximum safety, use find -print0 with a shell helper—but this is almost never needed in practice.

4. Three Common Tasks (Mini-Tools)

Task A: Print the 10 largest files under a directory

# largest-files.sh2
# Prints the 10 largest files in a directory.

func usage() {
    print("Usage: largest-files.sh <directory>")
}

func main() {
    if argc() < 1 {
        usage()
        return 1
    }
    
    let dir = arg(1)
    
    if !is_dir(dir) {
        print_err("Error: '" & dir & "' is not a directory")
        return 1
    }
    
    # find all files, get sizes, sort numerically, take top 10
    let result = capture(
        run("find", dir, "-type", "f", "-exec", "du", "-h", "{}", ";")
        | run("sort", "-rh")
        | run("head", "-n", "10"),
        allow_fail=true
    )
    
    if status() != 0 {
        print_err("Error finding files")
        return 1
    }
    
    print("Top 10 largest files in " & dir & ":")
    print(result)
}

Compile and run:

sh2c largest-files.sh2 -o largest-files.sh
./largest-files.sh /var/log

Task B: Delete .bak files older than N days (with confirmation)

# cleanup-bak.sh2
# Deletes .bak files older than N days with confirmation.

func usage() {
    print("Usage: cleanup-bak.sh <directory> [--dry-run]")
    print("")
    print("Options:")
    print("  --dry-run    Show what would be deleted without deleting")
}

func main() {
    if argc() < 1 {
        usage()
        return 1
    }
    
    let dir = arg(1)
    let dry_run = false
    
    if argc() >= 2 {
        if arg(2) == "--dry-run" {
            set dry_run = true
        }
    }
    
    if !is_dir(dir) {
        print_err("Error: '" & dir & "' is not a directory")
        return 1
    }
    
    # Find .bak files older than 7 days
    let files = capture(
        run("find", dir, "-name", "*.bak", "-mtime", "+7", "-type", "f"),
        allow_fail=true
    )
    
    if status() != 0 {
        print_err("Error searching for files")
        return 1
    }
    
    # Count files
    let count = 0
    for f in lines(files) {
        if f != "" {
            set count = count + 1
        }
    }
    
    if count == 0 {
        print("No .bak files older than 7 days found.")
        return 0
    }
    
    print($"Found {count} .bak file(s) older than 7 days:")
    for f in lines(files) {
        if f != "" {
            print("  " & f)
        }
    }
    
    if dry_run {
        print("[DRY RUN] Would delete " & count & " file(s)")
        return 0
    }
    
    # Confirm before deletion
    if !confirm($"Delete {count} file(s)?", default=false) {
        print("Aborted.")
        return 0
    }
    
    # Delete each file safely
    let failures = 0
    for f in lines(files) {
        if f != "" {
            run("rm", "--", f, allow_fail=true)
            if status() == 0 {
                print("Deleted: " & f)
            } else {
                print_err("Failed to delete: " & f)
                set failures = failures + 1
            }
        }
    }
    
    if failures > 0 {
        print_err("Completed with " & failures & " failure(s)")
        return 1
    }
    
    print("Done.")
}

Key patterns:

run("rm", "--", f) — The -- ensures filenames starting with - aren’t treated as options
confirm(..., default=false) — Non-interactive mode safely does nothing
--dry-run — Always offer a preview mode for destructive operations

Task C: Copy matching files to a destination

# copy-logs.sh2
# Copies all .log files to a destination directory.

func usage() {
    print("Usage: copy-logs.sh <source-dir> <dest-dir>")
}

func main() {
    if argc() < 2 {
        usage()
        return 1
    }
    
    let src = arg(1)
    let dest = arg(2)
    
    if !is_dir(src) {
        print_err("Error: source '" & src & "' is not a directory")
        return 1
    }
    
    # Create destination if needed
    run("mkdir", "-p", "--", dest, allow_fail=true)
    if status() != 0 {
        print_err("Error: could not create destination directory")
        return 1
    }
    
    let files = capture(
        run("find", src, "-name", "*.log", "-type", "f"),
        allow_fail=true
    )
    
    if status() != 0 {
        print_err("Error searching for files")
        return 1
    }
    
    let copied = 0
    for f in lines(files) {
        if f != "" {
            # Use --target-directory for safe copy
            run("cp", "--", f, dest, allow_fail=true)
            if status() == 0 {
                print("Copied: " & f)
                set copied = copied + 1
            } else {
                print_err("Failed to copy: " & f)
            }
        }
    }
    
    print("Copied " & copied & " file(s) to " & dest)
}

Key patterns:

run("mkdir", "-p", "--", dest) — Create directory tree safely
run("cp", "--", f, dest) — The -- protects against filenames starting with -

5. Scoped Working Directory

Using `with cwd("/path") { ... }`

Change the working directory for a block only:

func main() {
    print("Before: " & pwd())
    
    with cwd("/tmp") {
        print("Inside: " & pwd())
        run("ls", "-la")
    }
    
    print("After: " & pwd())  # Back to original
}

The block is scoped: After the with cwd(...) block ends, you’re back to the original directory automatically.

Why it requires a string literal

The path in cwd(...) must be a literal string—not a variable:

# ✅ Works
with cwd("/tmp/build") {
    run("make")
}

# ❌ Compile error: cwd requires literal path
let dir = "/tmp"
with cwd(dir) { ... }

Why? This is a safety restriction. Dynamic cd paths are a common source of injection bugs. By requiring literals, sh2 forces you to be explicit about where your script runs.

Workaround for dynamic paths

If you genuinely need a dynamic working directory, use this safe pattern:

func main() {
    let dir = arg(1)
    
    # Validate first
    if !is_dir(dir) {
        print_err("Not a directory: " & dir)
        return 1
    }
    
    # Safe pattern: pass path as argument to sh -c
    # sh(...) because: dynamic cwd requires shell; path passed as $1 prevents injection
    run("sh", "-c", "cd -- \"$1\" && ls -la", "sh2", dir)
}

Caveat: This uses run("sh", "-c", ...) which bypasses some sh2 safety. Always:

Validate the path first with is_dir()
Pass the variable as a positional argument ($1)
Never interpolate directly into the command string

6. Logging with Redirects

Basic file logging

func main() {
    with redirect { stdout: file("output.log") } {
        print("This goes to the file")
        run("date")
    }
    
    print("This goes to terminal")
}

Logging to file AND terminal (fan-out)

func main() {
    with redirect { 
        stdout: [file("output.log"), inherit_stdout()],
        stderr: [file("output.log", append=true), inherit_stderr()]
    } {
        print("Visible on terminal AND logged to file")
        run("some-command")
    }
}

Real example: install script with logging

# install-prereqs.sh2
# Installs prerequisites with full logging.

func ensure_log_dir() {
    run("mkdir", "-p", "logs", allow_fail=true)
}

func main() {
    ensure_log_dir()
    let log = "logs/install.log"
    
    print("Installing prerequisites...")
    print($"Log file: {log}")
    
    with redirect { 
        stdout: [file(log, append=true), inherit_stdout()],
        stderr: [file(log, append=true), inherit_stderr()]
    } {
        print("---")
        print("Started: " & capture(run("date")))
        
        print("Updating package lists...")
        sudo("apt-get", "update", n=true, allow_fail=true)
        if status() != 0 {
            print_err("Warning: apt-get update failed")
        }
        
        print("Installing build-essential...")
        sudo("apt-get", "install", "-y", "build-essential", n=true, allow_fail=true)
        if status() != 0 {
            print_err("Failed to install build-essential")
            return 1
        }
        
        print("Finished: " & capture(run("date")))
    }
    
    print("Installation complete. Check " & log & " for details.")
}

7. Rules of Thumb

When to use `find -delete` vs loop with `rm --`

Use `find -delete`	Use loop with `rm --`
Simple bulk deletion	Need per-file logging
No confirmation needed	Need confirmation per file
Trust the `find` expression	Want to validate each path

With find -delete:

run("find", ".", "-name", "*.bak", "-mtime", "+30", "-delete")

With loop:

for f in lines(files) {
    if f != "" {
        run("rm", "--", f, allow_fail=true)
        if status() == 0 { print("Deleted: " & f) }
    }
}

When to prefer pipelines vs structured steps

Use pipelines	Use structured steps
All stages are trustworthy	Need error handling per stage
Output is small	Output is large/streaming
One-liner clarity	Multi-line clarity

Pipeline (compact):

let top = capture(run("find", ".") | run("wc", "-l"))

Structured (debuggable):

let files = capture(run("find", "."), allow_fail=true)
if status() != 0 { return 1 }
let count = capture(run("wc", "-l"), allow_fail=true)

When to promote sh2do to a .sh2 script

Keep as sh2do	Promote to .sh2 file
Quick one-off	Will run again
Testing an idea	Needs arguments
Interactive exploration	Needs review/versioning
<5 statements	Needs functions

8. Common Mistakes

Mistake 1: Forgetting `--` before paths

# ❌ Breaks if path starts with -
run("rm", path)

# ✅ Safe for any path
run("rm", "--", path)

Mistake 2: Not checking `is_dir()` before operations

# ❌ Could operate on wrong path
let dir = arg(1)
run("rm", "-rf", "--", dir)

# ✅ Validate first
if !is_dir(dir) {
    print_err("Not a directory: " & dir)
    return 1
}

Mistake 3: Skipping the empty-string check in loops

# ❌ May process empty string as filename
for f in lines(files) {
    run("rm", "--", f)  # Could run "rm --" with no file
}

# ✅ Filter empty lines
for f in lines(files) {
    if f != "" {
        run("rm", "--", f)
    }
}

Next Steps

You now know how to handle files and directories safely in sh2. Here’s where to go next:

Reference docs

Language Reference — Full syntax and semantics
No Implicit Expansion — Why strings are strict literals
Logging and Redirects — Fan-out, file logging

Error Handling — Handle failures in file operations

Happy file wrangling! 📁

Source Code & Repository
👉 https://github.com/siu-mak/sh2lang