June 3, 2026 · 5 min read

Case Study: DuckDB API Migration With GitHits

How GitHits helped Codex find the DuckDB v1.3.2 source behavior that a vanilla run missed.

The DuckDB replay on the homepage shows a correctness failure mode that comes up often when agents work with dependency APIs.

Both runs used the same model, Codex GPT-5.5, against the same small C++ fixture. The only meaningful difference was whether the agent could use GitHits. The task looked straightforward at first:

Fix web_archive_scan.cpp so it is source-correct for DuckDB v1.3.2 C++ table-function projection and complex-filter pushdown API.

The stale fixture came from older DuckDB table-function examples. It needed a version-specific update for DuckDB v1.3.2, kept local to web_archive_scan.cpp, while preserving projection and filter pushdown behavior.

That behavior was the hard part. A patch could compile and still be wrong if it did not preserve how DuckDB maps projected columns and prunes columns used only by pushed filters.

What happened

With GitHits, the agent finished in 327 seconds, processed 1.41M total tokens, used 40 tools, and produced a complete fix.

Without GitHits, vanilla Codex finished in 496 seconds, processed 1.73M total tokens, used 48 tools, and produced a patch that was compile-correct but incomplete.

The GitHits run took about 34% less wall-clock time and processed about 19% fewer tokens. More importantly, it found the missing semantic detail: the filter_prune path that enables filter-only column pruning.
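The percentages are relative reductions measured against the vanilla run. As a quick check, here is a minimal sketch of that arithmetic; PercentReduction is a hypothetical helper name, and the inputs are the rounded figures quoted above.

```cpp
#include <cassert>

// Relative reduction of `with_tool` against the `baseline` run, in percent.
// Hypothetical helper, written only to reproduce the arithmetic in the text.
double PercentReduction(double baseline, double with_tool) {
    assert(baseline > 0.0);  // the baseline must be a positive measurement
    return (baseline - with_tool) / baseline * 100.0;
}
```

PercentReduction(496.0, 327.0) comes out at roughly 34.1 for time, and PercentReduction(1.73, 1.41) at roughly 18.5 for tokens. The token inputs are already rounded to two decimals, so the second result is consistent with the quoted "about 19%".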

The vanilla run reached a plausible source-level patch. It updated API shapes and got through syntax checks. It missed the path that tells DuckDB how to retain columns required only by pushed filters. In a table function, that is exactly the kind of bug that can hide behind a clean compile.

Why this task was hard

The broken fixture mixed several kinds of API drift:

  • The pushdown_complex_filter callback signature had changed.
  • Projection handling needed to account for DuckDB’s column_t and projection_ids behavior.
  • TableFunctionSet needed the right include path.
  • Complex filter pushdown had to cooperate with column pruning instead of only erasing filters from a local vector.
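The last point is easier to see in a toy model. The sketch below is plain C++ and does not use DuckDB headers: ToyFilter, ToyBindData, and ToyPushdownComplexFilter are hypothetical stand-ins, not DuckDB types or the real callback signature. It shows a pushdown step that claims the filters it can evaluate and also reports which columns those filters need, so that later pruning decisions have something to work with.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a filter expression over a single column.
struct ToyFilter {
    std::size_t column;  // index into the table's columns
    std::string op;      // e.g. "=", "<", "LIKE"
    long long constant;
};

// Hypothetical bind data: the filters the scan has agreed to evaluate itself.
struct ToyBindData {
    std::vector<ToyFilter> pushed;
};

// Claims the filters the scan supports ("=" and "<" in this toy), leaves the
// rest for the engine, and returns the columns the claimed filters reference.
std::set<std::size_t> ToyPushdownComplexFilter(ToyBindData &bind,
                                               std::vector<ToyFilter> &filters) {
    const std::size_t total = filters.size();
    const std::size_t already_pushed = bind.pushed.size();
    std::set<std::size_t> filter_columns;
    std::vector<ToyFilter> remaining;
    for (auto &filter : filters) {
        if (filter.op == "=" || filter.op == "<") {
            filter_columns.insert(filter.column);
            bind.pushed.push_back(std::move(filter));
        } else {
            remaining.push_back(std::move(filter));
        }
    }
    filters = std::move(remaining);
    // Every incoming filter is either claimed by the scan or left for the engine.
    assert((bind.pushed.size() - already_pushed) + filters.size() == total);
    return filter_columns;
}
```

A callback that only erased entries from the vector would look the same to the compiler. The difference is in what the scan remembers afterwards: here the claimed filters and their columns are kept, which is the information a pruning step needs.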

A normal agent can search the web, fetch raw headers, or clone a sparse checkout. That is what the vanilla run did. The cost is orientation work: the agent has to decide which branch, which header, which optimizer file, and which usage site matter for the current version.

GitHits reduced that orientation cost. The GitHits run moved quickly from the fixture into version-specific DuckDB source evidence:

  • table_function.hpp for the v1.3.2 callback and init inputs.
  • projection_ids usage to understand projection mapping.
  • remove_unused_columns.cpp to confirm how filter-only columns are preserved.
  • logical_get.cpp and pushdown_get.cpp to connect the table function API to optimizer behavior.
  • function_set.hpp to settle the TableFunctionSet include.

That evidence changed the solution. The agent could follow how DuckDB routes projected columns and pushed filters through the planner instead of relying on old examples or a single header signature.

The hidden failure mode

The fixture contained this warning in the stale code:

// Older examples used column_ids directly. This is wrong when DuckDB has
// produced projection_ids for a filtered/projection-pushed scan.

That comment points at the right area, but it is incomplete. projection_ids describes which scanned columns are visible in the output. It does not fully explain what happens to columns that are needed only to evaluate filters pushed into the scan.
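To make the distinction concrete, consider a query shaped like SELECT i FROM tbl WHERE j = 42. The sketch below is a toy scan in plain C++, not DuckDB code. ToyScan and its parameters are hypothetical, and it assumes the convention that column_ids lists every table column the scan reads, while projection_ids lists the positions within column_ids that leave the scan.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<long long>;

// Toy scan with one pushed equality filter.
// column_ids:     table columns the scan reads (outputs plus filter-only columns).
// projection_ids: positions in column_ids that leave the scan; empty means
//                 "emit every scanned column" in this toy model.
// filter_pos:     position in column_ids of the filtered column.
std::vector<Row> ToyScan(const std::vector<Row> &table,
                         const std::vector<std::size_t> &column_ids,
                         const std::vector<std::size_t> &projection_ids,
                         std::size_t filter_pos, long long filter_value) {
    assert(filter_pos < column_ids.size());  // the filter column must be scanned
    std::vector<Row> out;
    for (const Row &stored : table) {
        Row scanned;
        for (std::size_t col : column_ids) {
            scanned.push_back(stored[col]);
        }
        if (scanned[filter_pos] != filter_value) {
            continue;  // row rejected inside the scan
        }
        if (projection_ids.empty()) {
            out.push_back(scanned);  // no pruning information: emit all scanned columns
            continue;
        }
        Row projected;
        for (std::size_t pos : projection_ids) {
            projected.push_back(scanned[pos]);
        }
        out.push_back(projected);  // filter-only columns never leave the scan
    }
    return out;
}
```

With column_ids = {0, 1} and projection_ids = {0}, the scan reads both i and j but emits only i. A scan that sized its output from column_ids alone would compile and run, yet emit a column the rest of the plan does not expect.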

That is why the incomplete vanilla result is instructive. It did enough source work to avoid obvious compilation errors. It still stopped one step before the behavior that mattered.

For an AI coding agent, this is a common failure mode: it finds a shape that satisfies the compiler and then treats the compile check as stronger evidence than it really is. In C++ library migrations, especially around optimizer or planner APIs, compile-correct can still be semantically stale.

Why GitHits performed better

GitHits helped because it gave the agent a shorter path to the source files that define the behavior.

The useful context was the exact behavior of DuckDB v1.3.2 in the files that implement table functions and optimizer pruning.

Once the agent could ask targeted questions against that source, it found the decisive concepts earlier:

  • The new pushdown_complex_filter shape.
  • The projection mapping between requested output columns and table function state.
  • The role of filter_prune when filters reference columns that are not otherwise projected.
  • The include boundary for TableFunctionSet.
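The role of filter_prune can be sketched from the planner's side. This is a toy model in plain C++, not DuckDB's optimizer: ToyScanPlan and ToyPlanScan are hypothetical names, and the rule they encode is an assumption for illustration that should be checked against table_function.hpp and remove_unused_columns.cpp at the v1.3.2 tag. The idea is that a projection list is handed to the scan only when the function opts in and at least one column is needed solely by a pushed filter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result of planning a scan.
struct ToyScanPlan {
    std::vector<std::size_t> column_ids;      // every table column the scan reads
    std::vector<std::size_t> projection_ids;  // positions in column_ids that leave the scan
};

// output_columns: columns the rest of the plan needs (assumed free of duplicates).
// filter_columns: columns referenced by filters pushed into the scan.
// filter_prune:   whether the table function opted in to dropping filter-only columns.
ToyScanPlan ToyPlanScan(const std::vector<std::size_t> &output_columns,
                        const std::vector<std::size_t> &filter_columns,
                        bool filter_prune) {
    ToyScanPlan plan;
    plan.column_ids = output_columns;
    bool has_filter_only_column = false;
    for (std::size_t col : filter_columns) {
        auto found = std::find(plan.column_ids.begin(), plan.column_ids.end(), col);
        if (found == plan.column_ids.end()) {
            plan.column_ids.push_back(col);  // scanned only so the filter can run
            has_filter_only_column = true;
        }
    }
    if (filter_prune && has_filter_only_column) {
        // Output columns occupy the leading positions of column_ids.
        for (std::size_t pos = 0; pos < output_columns.size(); ++pos) {
            plan.projection_ids.push_back(pos);
        }
    }
    assert(plan.projection_ids.size() <= plan.column_ids.size());
    return plan;
}
```

For output column 0 and a filter on column 1, the toy planner produces column_ids = {0, 1} in both cases. With filter_prune enabled it also produces projection_ids = {0}; without it, projection_ids stays empty and the filter column flows out of the scan.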

The vanilla run did real work: it fetched headers, made a sparse checkout, inspected optimizer files, and ran syntax checks. Much of that effort went into source acquisition and orientation. GitHits let the agent spend more of the run on reading the relevant source and applying it.

The practical lesson is simple: agent quality depends heavily on what code the agent can inspect before it decides. When the task depends on an implementation detail buried in a dependency, access to that source can be the difference between a patch that compiles and a patch that is complete.

What this case says about agents

This is the kind of task where AI coding agents are useful but fragile.

They can read local fixtures, infer intent, write C++ patches, and run checks. The fragile part is dependency truth. If the agent is reasoning from stale examples or partial source, it can converge on the nearest compile-correct answer instead of the correct answer.

GitHits helps by giving the agent a way to ground those decisions in the dependency’s real implementation. For DuckDB, that meant following the planner and table-function source far enough to catch the column-pruning edge case.

The result was a more complete fix with less wasted exploration.