Rigorous Analysis with AI Agents

Day 2 - Block 4

AI is fuzzy, code is not

The hottest new programming language is English
Andrej Karpathy, January 24, 2023

Tips

Open source is powerful

AI can edit the code of the tool!

Get comfortable with markdown

The language that AI uses!

Get comfortable in the terminal

Simple and scriptable!

Stata-first workflow

Use AI to draft .do files and comments
Run locally, inspect logs, iterate with errors
Keep one verification note per run
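
The "inspect logs, iterate with errors" step can be partly scripted. Stata marks a failed command with a return-code line such as `r(111);`, so a small helper can pull those lines out of a log before they are pasted back to the AI. The sketch below is only an illustration: the function name `find_stata_errors` and the sample log are hypothetical, and it assumes the error message sits on the nearest non-empty line above the return code.

```python
import re

# Stata prints a return code such as "r(198);" on its own line after a failed command.
ERROR_CODE = re.compile(r"^r\((\d+)\);\s*$")


def find_stata_errors(log_text):
    """Return (line_number, return_code, message) for each error in a Stata log.

    The message is taken to be the nearest non-empty line above the return code,
    which is an assumption about the log layout.
    """
    lines = log_text.splitlines()
    errors = []
    for i, line in enumerate(lines):
        match = ERROR_CODE.match(line.strip())
        if not match:
            continue
        # Walk upward to find the message printed just before the code.
        message = ""
        for previous in reversed(lines[:i]):
            if previous.strip():
                message = previous.strip()
                break
        errors.append((i + 1, int(match.group(1)), message))
    return errors


# Hypothetical log excerpt, not output from a real run.
sample_log = """. regress wage educ exper
variable exper not found
r(111);

end of do-file
r(111);
"""

for line_no, code, message in find_stata_errors(sample_log):
    print(f"line {line_no}: r({code}) {message}")
```

The extracted list can double as the verification note for that run: an empty list means no return-code errors appeared in the log, although it says nothing about whether the results are correct.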

Unit tests


import numpy as np

def ols_beta(X, y):
    ...

def test_ols_function():
    x = np.array([1, 2, 3, 4])
    X = np.column_stack([np.ones(len(x)), x])
    y = 2 + 3 * x  # true model

    beta_hat = ols_beta(X, y)

    assert np.allclose(beta_hat, [2.0, 3.0], atol=1e-10)

Write these yourself or in a separate AI session!
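
For reference, here is one way the elided `ols_beta` body might be filled in so that the test can actually run. This is a sketch rather than the workshop's implementation: it assumes `X` already contains the intercept column, and it uses `np.linalg.lstsq` instead of forming the normal equations by hand.

```python
import numpy as np


def ols_beta(X, y):
    """Least-squares coefficients for y = X @ beta.

    Sketch only: assumes X already includes the intercept column.
    """
    # lstsq solves the least-squares problem without explicitly inverting X'X.
    beta, _residuals, _rank, _singular_values = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Same noise-free data as in the test: the fit should recover the true coefficients.
x = np.array([1, 2, 3, 4])
X = np.column_stack([np.ones(len(x)), x])
y = 2 + 3 * x

beta_hat = ols_beta(X, y)
assert np.allclose(beta_hat, [2.0, 3.0], atol=1e-10)
```

Because the data are generated without noise, the test has a known right answer, which is what makes it useful for checking AI-written estimation code.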

flowchart LR
  P[Prompt] --> C[AI writes code]
  C --> R[Run code]
  R --> T[Unit tests]
  T -->|pass| K[Keep]
  T -->|fail| F[Fix and rerun]
  F --> C

Write code to solve tasks!

Example: Edit Excel with AI

link

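import pandas as pd

# Load the raw survey responses from the "responses" sheet
df = pd.read_excel("raw_survey.xlsx", sheet_name="responses")
# Bin ages into right-closed intervals: (17, 29], (29, 44], (44, 65], (65, 120]
df["age_bucket"] = pd.cut(df["age"], bins=[17, 29, 44, 65, 120])
# Write to a new file so the raw workbook stays untouched
df.to_excel("processed_survey.xlsx", index=False)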

Example: Autoresearch

link

Data privacy: hide data, expose metadata [^s6]

flowchart LR
  D[SPSS raw data] -->|hidden| X[No direct model access]
  M[Metadata: columns labels scales types] --> A[Agent writes analysis code]
  A --> H[Human runs code locally]
  H --> O[Verified output]

Share schema, not records
Agent drafts code safely
Human executes on secure machine
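
As an illustration of "share schema, not records", the sketch below builds a metadata summary that could be handed to the agent in place of the data. It is an assumption-laden example: `describe_schema` is a hypothetical helper, the small DataFrame stands in for the hidden survey data, and only column names, types, and missing-value counts are covered (SPSS labels and scales would need a reader that exposes them).

```python
import pandas as pd


def describe_schema(df):
    """Summarise column names, dtypes, and missing counts without returning any cell values."""
    return [
        {
            "column": str(name),
            "dtype": str(df[name].dtype),
            "n_missing": int(df[name].isna().sum()),
        }
        for name in df.columns
    ]


# Hypothetical stand-in for the hidden survey data.
survey = pd.DataFrame(
    {
        "respondent_id": [101, 102, 103],
        "age": [34, 51, None],
        "satisfaction": ["high", "low", "high"],
    }
)

for entry in describe_schema(survey):
    print(entry)
```

Only the printed summary leaves the secure machine. If even missing-value counts are considered sensitive, that field can be dropped from the dictionary.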

Specific modes for coding agent

    "plan": {
      "tools": {
        "write": false,
        "edit": false,
        "bash": false,
        "read": true,
        "grep": true,
        "glob": true
      }
    }

Specific modes for coding agent

    "data": {
      "tools": {
        "write": true,
        "edit": false,
        "bash": false,
        "read": true,
        "grep": true,
        "glob": true
      }
    }
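
To see what separates the two modes, the tool settings can be compared directly. The sketch below copies both fragments into Python dictionaries. The helper `enabled_tools` is hypothetical and not part of any agent's API.

```python
# The two mode fragments, transcribed as Python dictionaries.
MODES = {
    "plan": {"tools": {"write": False, "edit": False, "bash": False,
                       "read": True, "grep": True, "glob": True}},
    "data": {"tools": {"write": True, "edit": False, "bash": False,
                       "read": True, "grep": True, "glob": True}},
}


def enabled_tools(mode_name):
    """Return the set of tool names switched on for the given mode."""
    tools = MODES[mode_name]["tools"]
    return {name for name, allowed in tools.items() if allowed}


# Tools that "data" has and "plan" lacks.
extra_in_data = enabled_tools("data") - enabled_tools("plan")
print(sorted(extra_in_data))  # → ['write']
```

The only difference is `write`: the "data" mode can create files but still cannot edit existing ones or run shell commands.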

Restrict access of coding agent

    "read": {
      "*": "allow",
      "*.env": "ask",
      "sensitive_*": "deny"
    }
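
How overlapping patterns are resolved depends on the agent, and the fragment does not say. As an illustration only, the sketch below assumes the last matching rule wins and uses Python's `fnmatch` for the glob patterns. `resolve_read` and the file names are hypothetical.

```python
from fnmatch import fnmatch

# Same rules as the fragment; order matters under the assumption stated below.
READ_RULES = {
    "*": "allow",
    "*.env": "ask",
    "sensitive_*": "deny",
}


def resolve_read(filename, rules=READ_RULES):
    """Return the action of the last rule whose pattern matches (assumed precedence)."""
    action = "deny"  # conservative default if nothing matches
    for pattern, rule_action in rules.items():
        if fnmatch(filename, pattern):
            action = rule_action
    return action


for name in ["analysis.py", "secrets.env", "sensitive_panel.sav"]:
    print(name, "->", resolve_read(name))
```

Under this assumption, the broad `"*"` rule acts as a default and the more specific rules listed after it override it. Check the agent's own documentation for its actual precedence before relying on a rule set like this.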
