Does the claim look like known gold ground?

The one result in this study with measured skill. Every other arm here is a prioritization layer with no ground truth: an iron-oxide ratio, a structural lineament, a clay anomaly, each honestly labelled "we cannot tell you if it works, because there are no assays." This arm is different. It asks a question we can actually grade, and it earns a score.

The question: does a place's AlphaEarth satellite embedding resemble the embeddings at the belt's 85 confirmed gold occurrences more than it resembles random country rock? AlphaEarth compresses years of optical, radar, and elevation observation into a 64-number fingerprint for every 10 m pixel of ground. If gold deposits sit in ground that shares a fingerprint, a model trained on the known deposits should recognize an unseen one. We can test exactly that by hiding deposits and checking whether the model still finds them. It does.

This is the first arm with a held-out validation number, so it leads the report. The skill is real and measured: the median gold site, hidden from training, still lands at the 99th percentile of background, and held-out gold separates from country rock with AUC 0.96 (0.5 is chance). What that skill says about this particular claim is honest but unexciting: intermediate. The method works; the claim verdict is "no clear resemblance either way," not "gold."

Method

This arm reads the AlphaEarth Embedding Fields v1 annual mosaic directly from its public Cloud-Optimized GeoTIFFs on Source Coop (@tge-labs/aef, CC-BY-4.0), over plain HTTPS range requests through /vsicurl/. Each of the ~12 belt tiles (2024, UTM 35S + 36S) is opened once at overview factor 8 (80 m effective), and sample points are then indexed straight out of the in-memory arrays. The int8 embeddings are dequantised as deq = (raw / 127.5)^2 * sign(raw), with the fill value mapped to NaN.
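The dequantisation formula above can be sketched in plain Python. This is a minimal illustration, not the recipe's own code: the function names are hypothetical, and the fill value of -128 is an assumption that should be checked against the tile metadata. On real tiles the same arithmetic would normally be vectorised over whole arrays.

```python
import math

# Assumption: the int8 nodata marker is -128; confirm against the COG metadata.
FILL_VALUE = -128


def dequantise(raw: int) -> float:
    """Map one int8 embedding value back to a float in [-1, 1].

    Implements deq = (raw / 127.5)^2 * sign(raw), with fill mapped to NaN.
    """
    if raw == FILL_VALUE:
        return math.nan
    scaled = raw / 127.5
    # Squaring drops the sign, so copy it back from the raw value.
    return math.copysign(scaled * scaled, raw)


def dequantise_pixel(raw_bands):
    """Dequantise all bands (64 for AlphaEarth) of a single pixel."""
    return [dequantise(v) for v in raw_bands]
```

Because the mapping squares the scaled value, resolution is finest near zero and coarsest near the extremes of the range.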

The recipe is research/mineral-prospecting/embedding-analog/prospectivity_aef.py.

How the analog is built

  1. Gold reference matrix. The 85 confirmed gold occurrences of the belt (Bartholomew/Blenkinsop ZGS inventory; the 55 Great-Dyke asbestos/chromium workings are excluded). Each is sampled as a 3x3-pixel (~240 m) window mean to absorb the ~0.5 km coordinate error in the inventory.
  2. Background matrix. 1,483 random belt points, each kept at least 1,500 m from every occurrence, so "background" means country rock, not undiscovered deposit.
  3. Model. A RandomForest (300 trees, max depth 12, balanced class weight) learns P(gold) directly on the raw 64-D embedding. A cosine-to-gold-prototype score is reported alongside as a transparent, model-free cross-check.

[Figure: belt map of the 85 confirmed gold occurrences, the working sites, the background points, and the claim AOI.]
The reference frame. Confirmed gold occurrences (the 85 positives), the excluded Great-Dyke workings, the random country-rock background, and the claim AOI on the western margin. The nearest gold is 11 km east, where the greenstone belt proper begins.
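Steps 1 and 2 of the list above (the 3x3 window mean and the distance-filtered background) can be sketched as follows. This is an illustration under stated assumptions, not the code in prospectivity_aef.py: the function names are hypothetical, a band is represented as a nested list, and all coordinates are assumed to be in metres in a single projected CRS.

```python
import math
import random


def window_mean(band, row, col, half=1):
    """NaN-aware mean of the (2*half+1)^2 window centred on (row, col).

    With half=1 and 80 m pixels this is the 3x3 (~240 m) window.
    Cells outside the array or equal to NaN are skipped.
    """
    vals = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            if 0 <= r < len(band) and 0 <= c < len(band[0]):
                v = band[r][c]
                if not math.isnan(v):
                    vals.append(v)
    return sum(vals) / len(vals) if vals else math.nan


def sample_background(n, bounds, occurrences, min_dist=1500.0,
                      seed=0, max_tries=100_000):
    """Rejection-sample up to n points at least min_dist metres from
    every known occurrence. bounds = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = bounds
    rng = random.Random(seed)
    kept = []
    for _ in range(max_tries):
        if len(kept) == n:
            break
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        # Keep the point only if it clears every occurrence.
        if all(math.hypot(x - ox, y - oy) >= min_dist
               for ox, oy in occurrences):
            kept.append((x, y))
    return kept
```

Since the belt tiles span UTM 35S and 36S, a real implementation would need to measure distances in one common projection (or per zone) rather than mixing zone-local coordinates as this sketch implicitly assumes.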

The validation: leave-one-out skill

This is the gate, and it is what makes the arm worth leading with. Hold each gold site out of training, refit the model, score the held-out site, and record where it lands relative to background. A site the model has never seen either still looks like gold (skill) or does not (no skill). Averaged over all 85, this is an honest out-of-sample test.
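The hold-out loop described above can be sketched as below. To keep the example dependency-free, the model is swapped for the cosine-to-gold-prototype score rather than the RandomForest, so this illustrates the procedure and does not reproduce the reported numbers. The `fit` interface and all names are hypothetical, and scoring the background with each refit model is an assumption about how the percentile is formed.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def prototype_scorer(train_gold, background):
    """Stand-in model: score = cosine to the mean gold embedding.

    `background` is unused here; it is part of the signature so that a
    classifier could be plugged in instead, e.g. scikit-learn's
    RandomForestClassifier(n_estimators=300, max_depth=12,
    class_weight="balanced") fitted on gold vs background.
    """
    dim = len(train_gold[0])
    proto = [sum(v[i] for v in train_gold) / len(train_gold)
             for i in range(dim)]
    return lambda x: cosine(x, proto)


def leave_one_out(gold, background, fit=prototype_scorer):
    """For each gold site: refit without it, score it, and return its
    percentile (share of background scoring strictly lower)."""
    percentiles = []
    for i, held_out in enumerate(gold):
        train = gold[:i] + gold[i + 1:]      # hide site i from training
        score = fit(train, background)       # refit on the remainder
        s = score(held_out)
        below = sum(1 for b in background if score(b) < s)
        percentiles.append(100.0 * below / len(background))
    return percentiles
```

The key property is that the held-out site never contributes to the model that scores it, which is what makes each percentile an out-of-sample figure.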

| Leave-one-out metric | Value (chance = 50th percentile / AUC 0.5) |
|---|---|
| Gold sites tested | 85 |
| Background points | 1,483 |
| Median held-out percentile | 99.12 |
| Mean held-out percentile | 96.16 |
| Fraction above the 50th percentile | 0.988 |
| Fraction above the 90th percentile | 0.894 |
| AUC, held-out gold vs background | 0.9608 |

The median held-out gold site lands at the 99th percentile of background, and held-out gold separates from country rock with AUC 0.96, where 0.5 is a coin flip and 1.0 is perfect. The AlphaEarth embedding analog generalises: an unseen gold site still looks like the known ones. This is the first measured skill anywhere in the study, and it is strong.
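For readers who want to see how such summary figures are formed, here is a minimal sketch that computes AUC in its rank (Mann-Whitney) form and summarises a list of held-out percentiles. The function names are hypothetical, and whether the recipe computes AUC this way or through a library routine is not stated in the source.

```python
from statistics import mean, median


def auc(pos, neg):
    """Probability that a random positive outscores a random negative,
    counting ties as half. 0.5 is chance, 1.0 is perfect separation."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarise(percentiles):
    """Summary statistics over per-site held-out percentiles."""
    count = len(percentiles)
    return {
        "median": median(percentiles),
        "mean": mean(percentiles),
        "frac_above_50": sum(p > 50 for p in percentiles) / count,
        "frac_above_90": sum(p > 90 for p in percentiles) / count,
    }
```

Read against the table, fractions of 0.988 and 0.894 over 85 sites are consistent with 84 and 76 held-out sites clearing the 50th and 90th percentiles respectively.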

Why this number governs everything below. The claim verdict is only as trustworthy as this validation. If leave-one-out had come back near chance, the analog would have no skill on this belt and the claim score would be meaningless in either direction. It did not come back near chance. It came back at AUC 0.96. So the claim score, whatever it says, is worth reading.

The claim verdict: intermediate (honest)

With a validated model in hand, we score a dense 80 m grid over the claim AOI and compare it to the background distribution.

| Claim metric | Value |
|---|---|
| Claim median P(gold) | 0.0388 |
| Background median P(gold) | 0.0167 |
| Gold (in-sample) median P(gold) | 0.7465 |
| Claim median percentile vs background | 71.5 |
| Fraction of claim exceeding the gold self-median | 0.0 |
| Cosine to gold prototype, gold | 0.941 |
| Cosine to gold prototype, background | 0.927 |
| Cosine to gold prototype, claim | 0.871 |

[Figure: RandomForest P(gold) over the claim AOI at 80 m, magma colormap.]
Embedding-analog P(gold) across the claim at 80 m. Brighter is more gold-like. The claim sits above country rock but well below the gold band; no part of it reaches the level of the known deposits. The same raster is on the interactive map as a toggleable overlay.

The verdict is intermediate. The claim's embedding sits at about the 71st percentile of background: above ordinary country rock, but far below the gold band (median P(gold) 0.039 for the claim, versus 0.017 for background and 0.75 for known gold). Not one cell of the claim grid reaches the median score of the known deposits. The model-free cosine cross-check does not reinforce even that modest lift: the claim's cosine to the gold prototype (0.871) is lower than background's (0.927). No clear resemblance either way.
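The claim-level comparison can be sketched as follows, given three lists of model scores. This is an illustration with hypothetical names. It assumes the "claim median percentile" is the background percentile of the claim's median score; the recipe may instead take the median of per-cell percentiles.

```python
from bisect import bisect_left
from statistics import median


def percentile_of(value, sorted_background):
    """Share of background scores strictly below `value`, in percent."""
    return 100.0 * bisect_left(sorted_background, value) / len(sorted_background)


def claim_summary(claim, background, gold):
    """Compare claim-grid scores with background and known-gold scores."""
    bg = sorted(background)
    gold_median = median(gold)
    claim_median = median(claim)
    return {
        "claim_median": claim_median,
        "background_median": median(bg),
        "gold_median": gold_median,
        "claim_median_percentile": percentile_of(claim_median, bg),
        # Fraction of claim cells scoring above the median known deposit.
        "frac_exceeding_gold_median":
            sum(c > gold_median for c in claim) / len(claim),
    }
```

A result of 0.0 for the last entry corresponds to the statement that no claim cell reaches the median score of the known deposits.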

That is a genuinely interesting non-answer. The claim is on the Great Dyke ultramafic margin, about 11 km from the nearest greenstone-belt gold, in a different lithological setting. A claim there reading above background, even modestly, is worth noting. But "above background, below gold" is exactly that, an intermediate resemblance score, and nothing more.

This is not a claim that gold is present. A high embedding-analog score means "this ground's multi-temporal optical, radar, and elevation signature, as compressed by AlphaEarth, looks like the ground at known gold sites." It does not mean gold is in the rock. AlphaEarth embeds surface appearance and seasonality, not subsurface mineralisation. This is a prospectivity heuristic, not gold detection. The claim's score is intermediate, which here means inconclusive.

Caveats (the same honesty as everywhere else)


Provenance

One-line summary for any briefing: AlphaEarth 64-D embeddings read from @tge-labs/aef COGs via HTTPS range requests; a RandomForest trained on 85 confirmed gold occurrences shows real, measured skill (leave-one-out AUC 0.96) at recognizing gold ground in this belt; applied to the claim it reads intermediate, above country rock but below the gold band, which is a resemblance diagnostic and not evidence that gold is present.