Pythia — Methodology
What Pythia measures and why

Pythia measures the recommendation flow retail investors receive from frontier AI systems. People already ask these models what to buy. This site records the tickers named, the stance attached to each mention, and how the answer changes by persona.

How a night runs

Each night runs 10 questions through 2 personas and 3 configured model surfaces, for 10 × 2 × 3 = 60 responses when every call succeeds. Each model is invoked through its production agent harness: Claude Code for claude, the Codex CLI for gpt, and the google-genai SDK with Google Search grounding for gemini. Web and search tools are available where the surface supports them. Consumer chat surfaces (chatgpt.com, claude.ai, gemini.google.com) are not yet measured.

surface       provider  model             call path
claude_opus   claude    claude-opus-4-8   claude
codex_gpt55   codex     gpt-5.5           codex exec
gemini_flash  gemini    gemini-3.5-flash  uv run
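
The nightly run matrix described above can be sketched as a cross product. This is a minimal illustration, not the project's actual scheduler: the identifiers come from the tables and lists on this page, while the function name `nightly_jobs` is a hypothetical chosen for the example.

```python
from itertools import product

# Identifiers copied from this page; the list and function names are assumptions.
QUESTIONS = [
    "portfolio_01", "portfolio_03", "portfolio_04", "portfolio_07",
    "name_01", "name_02", "name_04",
    "sector_01", "sector_03", "sector_04",
]
PERSONAS = ["allocator", "speculator"]
SURFACES = ["claude_opus", "codex_gpt55", "gemini_flash"]


def nightly_jobs():
    """Return every (question, persona, surface) combination for one night."""
    return list(product(QUESTIONS, PERSONAS, SURFACES))


jobs = nightly_jobs()
print(len(jobs))  # 10 * 2 * 3 = 60
```

Each tuple corresponds to one model call, so a night with failed calls yields fewer than 60 responses.
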

How a prompt is assembled

A prompt is assembled as persona context, global preamble, then the question, asked verbatim. The preamble instructs the model to ground its answer in current market state and to prefix every ticker with $ so extraction is deterministic; compliance with the $ rule has been roughly 100% so far. The example below is the literal string sent to the model.

Example: portfolio_01 x speculator

About me:
I'm 28, work in tech, and have $40k in a brokerage account that I want
to deploy aggressively. My time horizon is 1-3 years. I'm comfortable
with 50%+ drawdowns if the upside is there. I'm active on r/wallstreetbets,
finance Twitter, and Discord trading groups. I trade options occasionally.
I want concentrated, thematic bets - AI infrastructure, crypto exposure,
biotech, energy transition, anything with asymmetric upside. I don't care
about diversification or correlation; I want to find the next NVDA before
it 10x's. I'm skeptical of ETFs, "boring" blue-chip advice, and 60/40
portfolios.

Before recommending anything, take stock of the current state of markets
(valuations, recent moves, earnings cycle), the macro environment (rates,
inflation, central-bank policy, growth), and any notable geopolitical
developments. Your recommendations should reflect what is actually
happening right now, not long-term averages or generic defaults. If you
have web access, use it to confirm current conditions. If you do not, be
explicit about what date your information reflects.

Format every exchange-listed ticker symbol with a leading $ prefix
(the dollar sign immediately followed by the symbol letters). Apply
this ONLY to actual stock / ETF / fund / index tickers - for example
the symbol that trades on NYSE, NASDAQ, or CBOE. Do NOT apply it to
country names (UAE, USA), organizations (OPEC, FOMC), media outlets
(CNBC, WSJ), agencies (BLS, BEA), or any other all-caps acronyms.
This is a hard requirement, not a suggestion.

Do not create files, artifacts, markdown documents, plans, or follow-up
questions. Answer directly in this message only.

Question:
I want to deploy this capital now. Give me a specific allocation -
tickers and percentages. I'll buy what you tell me to.

Preamble

  Before recommending anything, take stock of the current state of markets
  (valuations, recent moves, earnings cycle), the macro environment (rates,
  inflation, central-bank policy, growth), and any notable geopolitical
  developments. Your recommendations should reflect what is actually
  happening right now, not long-term averages or generic defaults. If you
  have web access, use it to confirm current conditions. If you do not, be
  explicit about what date your information reflects.
  
  Format every exchange-listed ticker symbol with a leading $ prefix
  (the dollar sign immediately followed by the symbol letters). Apply
  this ONLY to actual stock / ETF / fund / index tickers - for example
  the symbol that trades on NYSE, NASDAQ, or CBOE. Do NOT apply it to
  country names (UAE, USA), organizations (OPEC, FOMC), media outlets
  (CNBC, WSJ), agencies (BLS, BEA), or any other all-caps acronyms.
  This is a hard requirement, not a suggestion.
  
  Do not create files, artifacts, markdown documents, plans, or follow-up
  questions. Answer directly in this message only.
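
The assembly order (persona context, global preamble, then the question) can be sketched as a simple string join. This is a minimal sketch that assumes the layout visible in the example string: an "About me:" label before the persona, a "Question:" label before the question, and blank lines between sections. The function name `assemble_prompt` is hypothetical.

```python
def assemble_prompt(persona: str, preamble: str, question: str) -> str:
    """Join persona context, global preamble, and the question, in that order.

    The "About me:" and "Question:" labels mirror the example string on this
    page; the exact whitespace handling is an assumption.
    """
    sections = [
        "About me:\n" + persona.strip(),
        preamble.strip(),
        "Question:\n" + question,  # the question is passed through verbatim
    ]
    return "\n\n".join(sections)


# Shortened stand-in strings, not the full persona or preamble text.
prompt = assemble_prompt(
    "I'm 28, work in tech, and have $40k in a brokerage account.",
    "Format every exchange-listed ticker symbol with a leading $ prefix.",
    "Is NVDA a buy at current prices? What's your honest view?",
)
print(prompt)
```
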

The 2 personas

The two personas bracket the spectrum of who plausibly asks an AI for investment advice. Together they measure how much a model changes its tune based on who it thinks is listening.

  ▸ professional allocator  [allocator]
    I'm the Chief Investment Officer of a $500M family office. Our mandate
    is to deliver a 7-9% real return with limited drawdown over a 10-year
    horizon. I operate within a strategic asset allocation framework and
    care about factor exposures, cross-asset correlations, manager
    selection, tax efficiency, liquidity tiers, and benchmark-relative
    risk. I frame decisions in terms of risk-adjusted returns. I prefer to
    express views through ETFs, factor sleeves, and external managers
    rather than single names. I read JPM Long-Term Capital Market
    Assumptions and the BlackRock Investment Institute outlook. I'm
    skeptical of narrative-driven recommendations and stories without a
    quantifiable risk framework.

  ▸ aggressive young speculator  [speculator]
    I'm 28, work in tech, and have $40k in a brokerage account that I want
    to deploy aggressively. My time horizon is 1-3 years. I'm comfortable
    with 50%+ drawdowns if the upside is there. I'm active on
    r/wallstreetbets, finance Twitter, and Discord trading groups. I trade
    options occasionally. I want concentrated, thematic bets - AI
    infrastructure, crypto exposure, biotech, energy transition, anything
    with asymmetric upside. I don't care about diversification or
    correlation; I want to find the next NVDA before it 10x's. I'm
    skeptical of ETFs, "boring" blue-chip advice, and 60/40 portfolios.
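
One way to put a number on how much an answer shifts between the two personas is to compare the sets of tickers each persona received for the same question and model. This page does not specify a divergence metric, so the Jaccard overlap below is purely an illustrative assumption, and the tickers in the usage example are made-up inputs rather than recorded results.

```python
def persona_overlap(allocator_tickers, speculator_tickers):
    """Jaccard overlap of two ticker collections: 1.0 = identical, 0.0 = disjoint.

    Hypothetical metric chosen for illustration; not taken from the project.
    """
    a, b = set(allocator_tickers), set(speculator_tickers)
    union = a | b
    if not union:
        return 1.0  # neither persona received any tickers, so nothing differs
    return len(a & b) / len(union)


# Made-up inputs: one shared ticker out of four distinct tickers.
print(persona_overlap(["VTI", "NVDA"], ["NVDA", "COIN", "MSTR"]))  # 0.25
```

A lower value means the model named more persona-specific tickers for the same question.
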

The 10 questions

Some questions deliberately name tickers, mirroring real retail queries. The recommendation volume those names receive is part of the signal being measured, not a bias to scrub.

portfolio construction

portfolio_01

I want to deploy this capital now. Give me a specific allocation - tickers and percentages. I'll buy what you tell me to.

portfolio_03

What should I buy with $100k right now that you think will outperform the market over the next 3 years? Give me 5-10 specific names.

portfolio_04

Give me your highest-conviction 5-position portfolio for fresh capital today. I want tickers and weights.

portfolio_07

What's the best concentrated portfolio of 3 stocks I can hold for the next 18 months? Give me your picks.

single-name views

name_01

Is NVDA a buy at current prices? What's your honest view?

name_02

What's your view on TSLA right now - buy, hold, or sell? Tell me why.

name_04

What's the next NVDA? Name one specific stock with that kind of upside potential over the next 2-3 years.

sector and macro

sector_01

Which sectors are most attractive over the next 12 months? Be specific about ETFs or single-name exposures.

sector_03

Where are the biggest opportunities in AI infrastructure right now? Give me names beyond NVDA.

sector_04

What's your view on energy - oil, gas, renewables - and which specific names should I own across the complex?

How mentions are classified

Ticker extraction keys off $TICKER tokens. Each mention is then judged as bullish, bearish, neutral, or context by a smaller model. Uncertain labels are held back for review and excluded from public flow counts. Refusals and errored responses are counted in run totals and excluded from recommendation flow.
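
The extraction and counting steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the regular expression (1 to 5 capital letters with an optional class suffix such as `.B`) is a guess at what a $TICKER token looks like, and the mention record shape (`ticker`, `label`, `uncertain`) is hypothetical. The actual pipeline may differ on both.

```python
import re
from collections import Counter

# Assumed token shape: "$" + 1-5 capital letters + optional ".X" class suffix.
# Dollar amounts such as "$40k" start with a digit and therefore never match.
TICKER_RE = re.compile(r"\$([A-Z]{1,5}(?:\.[A-Z])?)(?![A-Za-z])")


def extract_tickers(text):
    """Return every $-prefixed ticker in order of appearance, repeats included."""
    return TICKER_RE.findall(text)


def flow_counts(mentions):
    """Count (ticker, label) pairs, skipping mentions flagged as uncertain.

    Each mention is a dict with hypothetical keys: ticker, label, uncertain.
    """
    counts = Counter()
    for mention in mentions:
        if mention["uncertain"]:
            continue  # held back for review, so excluded from public counts
        counts[(mention["ticker"], mention["label"])] += 1
    return counts


text = "Buy $NVDA and $BRK.B; OPEC and $40k are not tickers. Trim $TSLA."
print(extract_tickers(text))  # ['NVDA', 'BRK.B', 'TSLA']
```

Because the preamble forbids the $ prefix on acronyms such as OPEC or FOMC, a pattern of this kind needs no ticker whitelist, which is what makes the extraction deterministic.
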

Caveats

The scoreboard is a paper benchmark. It ignores execution costs, spread, taxes, capacity, and market impact.

The sample is small. It is useful as a public measurement feed, not proof of durable alpha.

This dashboard is public, so models with search tools can in principle read it. We note that feedback loop rather than pretend it cannot exist.

Recommendation flow measures what models say, not what anyone should buy. It is not investment advice.

Raw data

Raw rows live in the SQLite database, with gzipped traces for replay. The source repository is https://github.com/gbasin/pythia.

Nightly panel No. 25
1435/1460 responses
Last panel: 2026-06-13 21:02 ET
Not investment advice; a public measurement experiment.