Scoring Myers-Briggs on the Phone: Why Psychometrics Should Be Private

Loxation Team

March 12, 2026 - 9 minutes read - 1747 words

Your personality type is nobody’s business but yours.

When you take a personality quiz online, your answers travel to a server. That server scores them, stores them, and — let’s be honest — probably sells the derived profile to an ad network before the results page finishes loading. The Cambridge Analytica scandal wasn’t about a data breach. It was about a personality quiz working exactly as designed: harvest psychometric data at scale, model voters, win elections.

We think there’s a better way. What if the quiz never left your phone?

The problem with cloud-scored surveys

A Myers-Briggs-style assessment collects 32 data points about how you think, socialize, make decisions, and organize your life. From those 32 answers, four scores are computed. From those four scores, one of sixteen personality types is assigned. The math is trivial — weighted sums and thresholds. A 1990s graphing calculator could do it.

Yet every major personality quiz platform sends your raw answers to a server. Why? Because the answers are the product. The personality type you get back is the receipt.

Even well-intentioned platforms create risk through aggregation. A database of personality profiles linked to email addresses is a targeting goldmine — for recruiters screening “wrong” personality types, insurers modeling risk, or state actors profiling dissidents. The safest database is one that doesn’t exist.

Surveys as data, not code

Our approach starts with a declarative survey format. A .survey file is a self-contained JSON document that describes everything needed to administer and score an assessment: questions, answer scales, scoring formulas, and the ontology that interprets the results. No server-side logic required.

Here’s the key insight: a scoring formula is just a dot product.

Take the Extraversion–Introversion dimension from a 32-question MBTI instrument. The classic formula looks like this:

IE = 30 - Q3 - Q7 - Q11 + Q15 - Q19 + Q23 + Q27 - Q31

That’s a constant (30) plus a weighted sum in which each of the eight questions on this dimension contributes +1 or -1. Give every other question a weight of 0 and the formula becomes two 32-element vectors:

weights:  [0, 0, -1, 0, 0, 0, -1, 0, 0, 0, -1, 0, 0, 0, +1, 0,
           0, 0, -1, 0, 0, 0, +1, 0, 0, 0, +1, 0, 0, 0, -1, 0]

answers:  [3, 2, 4, 1, 5, 2, 3, 4, 1, 2, 5, 3, 4, 2, 1, 3,
           2, 4, 3, 5, 1, 2, 4, 3, 2, 1, 3, 4, 5, 2, 3, 1]

IE = 30 + dot(weights, answers)

Every psychometric formula we’ve encountered — MBTI, Big Five factor scoring, Holland Code occupational themes — reduces to this: a constant plus an inner product. The formula is data. It travels inside the survey definition as a weight vector, not as executable code that needs a server to run.
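The rewrite is easy to check by hand or in a few lines of Python. The following is a minimal sketch rather than Rukuzu's implementation; the helper name `score` is hypothetical, and the two vectors are copied from the example above.

```python
# Illustrative sketch: score one dimension as constant + dot(weights, answers).
# `score` is a hypothetical helper, not part of Rukuzu.

IE_CONSTANT = 30

# One weight per question, Q1..Q32; questions outside the IE dimension get 0.
IE_WEIGHTS = [0, 0, -1, 0, 0, 0, -1, 0, 0, 0, -1, 0, 0, 0, +1, 0,
              0, 0, -1, 0, 0, 0, +1, 0, 0, 0, +1, 0, 0, 0, -1, 0]

ANSWERS = [3, 2, 4, 1, 5, 2, 3, 4, 1, 2, 5, 3, 4, 2, 1, 3,
           2, 4, 3, 5, 1, 2, 4, 3, 2, 1, 3, 4, 5, 2, 3, 1]


def score(constant, weights, answers):
    """Return the constant plus the inner product of weights and answers."""
    if len(weights) != len(answers):
        raise ValueError("weights and answers must have the same length")
    return constant + sum(w * a for w, a in zip(weights, answers))


ie = score(IE_CONSTANT, IE_WEIGHTS, ANSWERS)
print(ie)  # 20
```

With the sample answers the result is 20, the same value the explicit formula gives: 30 - 4 - 3 - 5 + 1 - 3 + 4 + 3 - 3.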

This matters because it means adding a new survey instrument requires zero platform code changes. Design the questions, define the weight vectors, write the ontology, publish the .survey file. The phone already knows how to score it.

The graph layer: Rukuzu

On the device, survey formulas are stored as nodes in a local graph database called Rukuzu. Each Formula node carries its weight vector, constant, threshold, score range, and the names of the two poles it discriminates between:

Formula {
    dimension: "IE",
    weights: [0, 0, -1, 0, ...],   // 32-element vector
    constant: 30.0,
    threshold: 24,
    range_min: 8,                // lowest reachable score, assuming 1-5 answers
    range_max: 40,               // highest reachable score, assuming 1-5 answers
    positive_class: "Extravert",
    negative_class: "Introvert"
}
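The score range and threshold follow from the weights once the answer scale is fixed. The sketch below assumes every answer is an integer from 1 to 5, which the sample answers suggest but the post does not state; `score_range` is a hypothetical helper.

```python
# Illustrative sketch: derive a formula's reachable score range from its
# weights. Assumes a 1-5 answer scale; `score_range` is a hypothetical helper.

IE_CONSTANT = 30
IE_WEIGHTS = [0, 0, -1, 0, 0, 0, -1, 0, 0, 0, -1, 0, 0, 0, +1, 0,
              0, 0, -1, 0, 0, 0, +1, 0, 0, 0, +1, 0, 0, 0, -1, 0]


def score_range(constant, weights, lo=1, hi=5):
    """Return (min, max) of the score for answers between lo and hi."""
    # A positive weight contributes least at the lowest answer and most at
    # the highest; a negative weight is the other way round.
    smallest = constant + sum(w * (lo if w > 0 else hi) for w in weights)
    largest = constant + sum(w * (hi if w > 0 else lo) for w in weights)
    return smallest, largest


range_min, range_max = score_range(IE_CONSTANT, IE_WEIGHTS)
midpoint = (range_min + range_max) / 2
print(range_min, range_max, midpoint)  # 8 40 24.0
```

Under that assumption the IE score runs from 8 to 40, and the midpoint of 24 matches the threshold on the Formula node.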

When you complete a survey, your answer vector is persisted on a graph edge. Scoring is a single Cypher query that runs entirely on-device:

MATCH (f:Formula)-[:SCORES]->(s:Survey {survey_id: $sid})
WITH f,
     f.constant + array_inner_product(f.weights, $answers) AS score
RETURN f.dimension,
       CASE WHEN score > f.threshold
            THEN f.positive_class
            ELSE f.negative_class END AS pole,
       CASE WHEN score > f.threshold
            THEN (score - f.threshold) / (f.range_max - f.threshold)
            ELSE (f.threshold - score) / (f.threshold - f.range_min)
       END AS degree

Rukuzu is a from-scratch Rust graph database with SIMD-accelerated vector operations — NEON on Apple Silicon, AVX2 on Intel. A 32-element dot product completes in nanoseconds. But the performance isn’t really the point. The point is that the query runs locally. Your answer vector never touches a network socket.

The degree value (0.0 to 1.0) captures something most personality quizzes throw away: how strongly you lean toward a pole. Someone who scores 38 on the IE scale isn’t the same kind of extravert as someone who scores 25. Classical MBTI collapses both to “E.” We preserve the gradient.
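To make the gradient concrete, here is a Python mirror of the two CASE expressions in the query. It is a sketch under assumptions: `classify` is a hypothetical helper, and the range of 8 to 40 is what a 1 to 5 answer scale would imply for the IE formula, not a value given in the post.

```python
# Illustrative sketch of the pole/degree logic from the Cypher query.
# range_min=8 and range_max=40 are assumptions (1-5 answer scale);
# `classify` is a hypothetical helper.

def classify(score, threshold, range_min, range_max,
             positive_class, negative_class):
    """Return (pole, degree) with degree between 0.0 and 1.0."""
    if score > threshold:
        degree = (score - threshold) / (range_max - threshold)
        return positive_class, degree
    degree = (threshold - score) / (threshold - range_min)
    return negative_class, degree


for s in (38, 25, 20):
    print(s, classify(s, 24, 8, 40, "Extravert", "Introvert"))
# 38 ('Extravert', 0.875)
# 25 ('Extravert', 0.0625)
# 20 ('Introvert', 0.25)
```

Both 38 and 25 land on the Extravert side, but their degrees differ by more than a factor of ten, which is the information a bare “E” discards.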

The reasoning layer: Dealer

Raw scores are necessary but not sufficient. “IE = 28, SN = 19, FT = 31, JP = 22” is precise, but it doesn’t mean anything until it’s interpreted in the context of a classification system.

This is where Dealer comes in — a fuzzy Description Logic (EL++) reasoner that runs on-device alongside Rukuzu. Dealer works with OWL 2 ontologies: formal descriptions of concepts and how they relate.

The MBTI ontology declares eight dimension poles as classes, constrains them as disjoint pairs (you can’t be both Extravert and Introvert), and defines sixteen type classes as intersections:

DisjointClasses(mbti:Extravert mbti:Introvert)
DisjointClasses(mbti:Sensing   mbti:Intuitive)

SubClassOf(
  ObjectIntersectionOf(
    mbti:Introvert mbti:Intuitive
    mbti:Thinking  mbti:Judging)
  mbti:INTJ)

AnnotationAssertion(mbti:label mbti:INTJ "The Architect")

For each dimension, Rukuzu produces a pole class and a degree. Dealer receives these as fuzzy class assertions — “this individual is Introvert with degree 0.82” — and runs subsumption. Under Zadeh semantics, the degree of an intersection is the minimum of its components. An INTJ with dimensions at (0.82, 0.91, 0.67, 0.88) has a type degree of 0.67 — the weakest link. That single number tells you: this person’s type is fairly clear, but the Thinking dimension is where they’re most borderline.
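The minimum rule is small enough to show directly. This sketch only illustrates the arithmetic of a Zadeh intersection; `type_degree` is a hypothetical helper and says nothing about Dealer's actual API.

```python
# Illustrative sketch: degree of an intersection under Zadeh (min) semantics.
# `type_degree` is a hypothetical helper, not Dealer's API.

def type_degree(pole_degrees):
    """Return (degree, weakest_pole) for a conjunction of fuzzy memberships."""
    weakest = min(pole_degrees, key=pole_degrees.get)
    return pole_degrees[weakest], weakest


intj = {"Introvert": 0.82, "Intuitive": 0.91, "Thinking": 0.67, "Judging": 0.88}
print(type_degree(intj))  # (0.67, 'Thinking')
```

Returning the weakest pole alongside the degree keeps the "most borderline dimension" reading available without a second pass over the data.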

This two-step pipeline — Rukuzu computes, Dealer reasons — means the scoring engine and the classification logic are independent. You could swap in a Big Five ontology with five continuous dimensions and no discrete types. Or a clinical instrument with hierarchical diagnostic criteria. Rukuzu doesn’t care what the vectors mean. Dealer doesn’t care where the numbers came from.

The interplay: signals, not scores

We call the bridge between surveys and ontologies a signal mapping. Each mapping declares: “this survey question (or formula) produces a signal that should be asserted as a class membership in a given ontology.”

Some mappings are direct. A yes/no question about whether you prefer working alone maps straight to a class assertion. Others are formulas — the weighted sums described above. A few are option-based: pick “justice” and you’re asserting one class, pick “compassion” and you’re asserting another.
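One way to picture the three mapping kinds is as a small dispatch function that turns answers into class assertions. Everything in this sketch is hypothetical: the dictionary layout, the field names, and the class names are made up for illustration and are not the .survey schema. It also assumes direct and option-based mappings assert with degree 1.0.

```python
# Illustrative sketch of the three mapping kinds. The dictionary layout and
# all names are hypothetical, not the .survey schema.

def signals(mapping, answers):
    """Return a list of (class_name, degree) assertions for one mapping."""
    kind = mapping["kind"]
    if kind == "direct":
        # Yes/no question: assert the class only when the answer is yes.
        yes = bool(answers[mapping["question"]])
        return [(mapping["asserts"], 1.0)] if yes else []
    if kind == "option":
        # Each option names the class it asserts.
        chosen = answers[mapping["question"]]
        return [(mapping["options"][chosen], 1.0)]
    if kind == "formula":
        # Weighted sum, then pole and degree relative to the threshold.
        score = mapping["constant"] + sum(
            w * answers[q] for q, w in mapping["weights"].items())
        t = mapping["threshold"]
        if score > t:
            return [(mapping["positive"], (score - t) / (mapping["range_max"] - t))]
        return [(mapping["negative"], (t - score) / (t - mapping["range_min"]))]
    raise ValueError(f"unknown mapping kind: {kind}")


answers = {"works_alone": True, "value": "justice", "Q1": 5, "Q2": 2}
mappings = [
    {"kind": "direct", "question": "works_alone", "asserts": "PrefersSolitude"},
    {"kind": "option", "question": "value",
     "options": {"justice": "ValuesJustice", "compassion": "ValuesCompassion"}},
    {"kind": "formula", "constant": 0, "weights": {"Q1": 1, "Q2": -1},
     "threshold": 0, "range_min": -4, "range_max": 4,
     "positive": "Extravert", "negative": "Introvert"},
]
for m in mappings:
    print(signals(m, answers))
# [('PrefersSolitude', 1.0)]
# [('ValuesJustice', 1.0)]
# [('Extravert', 0.75)]
```

Whatever the kind, the output has the same shape, a class name and a degree, which is what lets a single reasoner consume all three.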

What makes this interesting is that the same signal infrastructure handles personality surveys, trust scoring, and safety analysis. A personality ontology classifies you as INTJ. A trust ontology classifies a peer contact as HighTrust based on interaction signals gathered from the graph. Different ontologies, different signal sources, same reasoning engine, same phone.

This convergence isn’t accidental. Personality research and trust modeling share a common structure: observe discrete behaviors, aggregate them into continuous dimensions, classify the dimensions into categories. The math is identical. Only the semantics differ. By treating both as ontology problems, we avoid building two separate systems that do the same thing with different bugs.

What stays on the phone

Let’s trace what happens when you complete a personality survey in our system:

Questions render from local data. The .survey file was downloaded once and cached. No server call per question.
Answers stay in local memory during the session. They’re written to an encrypted database (SQLCipher) only on completion.
Scoring runs as a vector dot product inside Rukuzu. The answer vector and weight vectors never leave the process.
Classification runs through Dealer’s reasoner — a fuzzy EL++ engine compiled into the app binary. No API call.
Results are stored on a graph edge linking your device to the survey. The personality type, dimension degrees, and answer vector are encrypted at rest.
If you share results with a group, only the classified type and degrees travel over the wire, not your raw answers. They are sent through an MLS-encrypted group channel, end to end, so no server can read them.
If you delete the survey, the answer vector, scores, and classification are wiped from the graph. There is no server-side copy to forget to delete.
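The sharing step in the trace above comes down to choosing which fields are allowed to leave the device. The sketch below shows one way to do that with an allow-list. The field names and the `mbti-32` identifier are hypothetical, and the MLS encryption itself is out of scope here.

```python
# Illustrative sketch: build a share payload that omits raw answers.
# Field names and the survey id are hypothetical; encryption is not shown.
import json

# Only these fields may leave the device.
SHAREABLE_FIELDS = ("survey_id", "type", "degrees")


def share_payload(result):
    """Return a JSON string containing only the shareable fields."""
    return json.dumps({k: result[k] for k in SHAREABLE_FIELDS}, sort_keys=True)


result = {
    "survey_id": "mbti-32",
    "type": "INTJ",
    "degrees": {"Introvert": 0.82, "Intuitive": 0.91,
                "Thinking": 0.67, "Judging": 0.88},
    "answers": [3, 2, 4, 1, 5],   # stays on the device
    "scores": {"IE": 20},          # stays on the device
}
payload = share_payload(result)
print("answers" in payload, "scores" in payload)  # False False
```

An allow-list fails safe: a field added to the stored result later stays private unless someone deliberately adds it to `SHAREABLE_FIELDS`.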

Compare this to the cloud model: answers readable by whichever server receives them, stored indefinitely, scored on infrastructure you don’t control, retained in backups after you delete your account, and sold to third parties under a privacy policy you didn’t read.

The hard part isn’t the math

Implementing on-device psychometric scoring is straightforward. The hard part is making the system extensible without making it programmable. A .survey file that contains arbitrary code is a malware vector. A .survey file that contains only data — questions, weights, ontology references — is inert.

That’s why formulas are weight vectors, not expressions. A weight vector can be validated: is it the right length? Are the values in a reasonable range? It can’t execute code, access the filesystem, or phone home. The most a malicious weight vector can do is produce wrong personality scores, which is what most online quizzes do anyway.
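Those two validation questions translate almost directly into code. This is a sketch under assumptions: the post does not say what counts as a reasonable range, so the `max_abs` limit of 10 is a placeholder, and `validate_weights` is a hypothetical helper.

```python
# Illustrative sketch of weight-vector validation. The max_abs limit is an
# assumption; the post does not define a "reasonable range".
import math


def validate_weights(weights, question_count, max_abs=10.0):
    """Return a list of problems; an empty list means the vector passed."""
    problems = []
    if len(weights) != question_count:
        problems.append(f"expected {question_count} weights, got {len(weights)}")
    for i, w in enumerate(weights, start=1):
        if isinstance(w, bool) or not isinstance(w, (int, float)):
            problems.append(f"weight {i} is not a number")
        elif isinstance(w, float) and not math.isfinite(w):
            problems.append(f"weight {i} is not finite")
        elif abs(w) > max_abs:
            problems.append(f"weight {i} is outside [-{max_abs}, {max_abs}]")
    return problems


print(validate_weights([0, -1, 1], 3))                 # []
print(len(validate_weights([float("nan"), "x", 99], 3)))  # 3
```

The validator only inspects values. Nothing in the vector is evaluated or executed, so a bad file can at worst be rejected or produce a wrong score.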

The ontology layer provides a second guardrail. Even if someone crafts a survey with adversarial weight vectors designed to always classify users as a particular type, the ontology’s disjointness axioms and degree calculations will produce suspicious results — degrees clustered at 0.0 or 1.0, dimensions that don’t vary across users. These patterns are detectable.
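One way to surface such patterns on a single device is to score a batch of trial answer vectors and look at the outcomes. That approach is an assumption of this sketch, not something the post describes, and both helper names are hypothetical. It also assumes a 1 to 5 answer scale.

```python
# Illustrative sketch: probe a formula with trial answers and flag outcomes
# that never vary. The trial-vector approach and both helpers are assumptions.
import random


def trial_outcomes(constant, weights, threshold, range_min, range_max,
                   trials=200, seed=0):
    """Score random 1-5 answer vectors; return (is_positive, degree) pairs."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        answers = [rng.randint(1, 5) for _ in weights]
        score = constant + sum(w * a for w, a in zip(weights, answers))
        if score > threshold:
            outcomes.append((True, (score - threshold) / (range_max - threshold)))
        else:
            outcomes.append((False, (threshold - score) / (threshold - range_min)))
    return outcomes


def looks_rigged(outcomes):
    """True if every trial hit the same pole or every degree is 0.0 or 1.0."""
    one_pole = len({positive for positive, _ in outcomes}) == 1
    all_extreme = all(d <= 0.0 or d >= 1.0 for _, d in outcomes)
    return one_pole or all_extreme


IE_WEIGHTS = [0, 0, -1, 0, 0, 0, -1, 0, 0, 0, -1, 0, 0, 0, +1, 0,
              0, 0, -1, 0, 0, 0, +1, 0, 0, 0, +1, 0, 0, 0, -1, 0]
honest = trial_outcomes(30, IE_WEIGHTS, 24, 8, 40)
rigged = trial_outcomes(40, [0] * 32, 24, 8, 40)  # answers are ignored
print(looks_rigged(honest), looks_rigged(rigged))
```

The rigged formula ignores every answer, so all 200 trials come back as the same pole with degree 1.0, while the IE formula from earlier spreads across both poles.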

Why this matters beyond personality quizzes

Personality surveys are our starting point, not our destination. The same architecture supports:

Mental health screenings (PHQ-9, GAD-7) where privacy isn’t a feature but a clinical requirement
Learning style assessments that help study groups form without a platform profiling students
Team compatibility surveys shared within encrypted group channels — your employer sees the team dynamics report, not your individual answers
Research instruments where participants can contribute aggregate statistics (ontology classifications) to a study without exposing raw responses

In each case, the principle is the same: compute locally, reason locally, share only the conclusions you choose, encrypted to the people you choose.

The phone in your pocket has more compute power than the machines that originally scored these instruments. There is no technical reason your personality profile needs to exist on someone else’s server. The only reason it does is that the business model requires it.

We’d rather build a different business model.

Loxation is a peer-to-peer messaging app that operates over Bluetooth mesh networks with end-to-end encryption. The survey system described here uses Rukuzu, an on-device graph database, and Dealer, a fuzzy OWL 2 EL++ reasoner — both designed to run where your data lives: on your device.