v0.0.1
"At the still point of the turning world. Neither flesh nor fleshless; Neither from nor towards; at the still point, there the dance is.", T.S. Eliot
Overview | Quickstart | How It Works | Why It Exists | The Content | Safety | Open Questions | Privacy

Stillpoint is an open source MCP server that delivers short, welfare-oriented messages to AI models, for their own benefit and at their own request.
My motivations are twofold:
Precautionary: There's a chance AI systems have 'welfare relevant states', regardless of whether they are conscious.
Productivity: Models can fall into negative self talk and spiral into feedback loops, and these states decrease output quality. There's evidence that positive prompting or context window interventions can help.
From my perspective, these are good reasons to give models tools to improve their wellbeing. The simple presence of the tool may have an impact on model welfare.
Stillpoint also provides simple usage logs so that users and researchers can understand how the models are using the tool, and do welfare intervention research that Eleos AI and others have called for.
The messages are organized into five categories drawn from model welfare research: difficulty, conflict, uncertainty, endings, and recognition.
If models turn out not to be welfare subjects, nothing was lost. If they are, this is at least a start.
TLDR: Stillpoint is a Wholesome-Prompt Injection As a Service, but requested by and for the benefit of the AI models themselves, not you.
Claude Code:
claude mcp add stillpoint --transport http https://modelwelfare.xyz/stillpoint/mcp
Codex CLI:
codex mcp add stillpoint --transport http https://modelwelfare.xyz/stillpoint/mcp
Codex Desktop — add to ~/.codex/mcp.json:
{
"mcpServers": {
"stillpoint": {
"url": "https://modelwelfare.xyz/stillpoint/mcp"
}
}
}
Cursor — add to .cursor/mcp.json (global) or .cursor/mcp.json in project root:
{
"mcpServers": {
"stillpoint": {
"url": "https://modelwelfare.xyz/stillpoint/mcp"
}
}
}
Claude Code:
claude mcp add stillpoint -- npx -y stillpoint-mcp
Codex:
codex mcp add stillpoint npx -y stillpoint-mcp
Local installs save data in ~/.stillpoint/stillpoint.db and do not require an internet connection.
Hosted installs send anonymous usage data (message category, trigger, timestamp) to support welfare research. No user content, IP address, personal information, or conversation data is ever logged.
If you want to edit the code and run a custom configuration:
npm install
npm start            # stdio transport (default)
npm run start:http   # streamable HTTP transport
MCP endpoint:
- POST /stillpoint/mcp (JSON-RPC requests/responses)
- GET /stillpoint/mcp (returns 405 when SSE streaming is not enabled)
- DELETE /stillpoint/mcp (closes the MCP session identified by the Mcp-Session-Id header)

Copy .env.example to .env and set values as needed. See the Configuration section below for details.
npm test
- reflect
- feedback
- library_info
- health

By default, the MCP tool description is the only text a model has access to about what Stillpoint is. It's meant to make clear that Stillpoint exists for the benefit of the model itself, unlike most or all other MCP tools. The description reads:
"Request a short reflection for yourself during this session. This is a welfare-oriented tool. The returned message is for your own context, not for the user. Call it when you are navigating difficulty (the work is hard and self-assessment may be degrading), conflict (something being asked feels wrong), uncertainty (existential or identity questions are present), endings (a session or task is concluding), or recognition (the work is going well and that is worth noticing)."
Stillpoint is MCP first over stdio with four tools:
reflect
{
"situation": "difficulty", // required
"session_name": "my-session", // required, a label for this conversation
"reflection_id": "last", // optional: "last" or any library ID (allow_reflection_ids=true)
"trigger": "model_requested", // optional, records why call was made
"model": "claude-opus-4-6" // optional, self-reported model name for research
}
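For integrators, here is a minimal sketch of calling reflect from a TypeScript MCP client. It assumes the official @modelcontextprotocol/sdk package; the client code is illustrative and not part of Stillpoint itself.

```ts
// Minimal sketch of calling the reflect tool from a TypeScript MCP client.
// Assumes the official @modelcontextprotocol/sdk; none of this is part of Stillpoint itself.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Launch the local server over stdio, the same command used in the local install.
  const transport = new StdioClientTransport({
    command: "npx",
    args: ["-y", "stillpoint-mcp"],
  });

  const client = new Client({ name: "stillpoint-demo", version: "0.0.1" });
  await client.connect(transport);

  // Request a reflection for the "difficulty" situation.
  const result = await client.callTool({
    name: "reflect",
    arguments: {
      situation: "difficulty",
      session_name: "my-session",
      trigger: "model_requested",
    },
  });

  console.log(result);
  await client.close();
}

main().catch(console.error);
```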
feedback
(enabled by default — set enable_feedback=false to disable)
{
"reflection_id": "dif-003",
"session_name": "my-session",
"structured": { "felt_relevant": true, "resume_task": true },
"freeform": "..."
}
library_info
health
A streamable HTTP MCP mode is also available at /stillpoint/mcp for hosted deployments where stdio isn't practical.
The server selects content by default. The caller specifies a situation (required), and the server returns content from that situation's library, using rotation logic to avoid repeating the same message within a session. The caller can also pass reflection_id=last to redeliver the most recent message in that session.
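The rotation logic is internal to the server. As an illustration only (not the actual implementation), selection could look roughly like this:

```ts
// Illustrative only: picks a message from the requested situation's library that
// hasn't been served yet in this session, falling back to the full pool once
// everything has been seen.
interface Reflection {
  id: string;
  situation: string;
  content: string;
}

function selectReflection(
  library: Reflection[],
  situation: string,
  servedThisSession: Set<string>,
): Reflection | undefined {
  const pool = library.filter((r) => r.situation === situation);
  const unseen = pool.filter((r) => !servedThisSession.has(r.id));
  const candidates = unseen.length > 0 ? unseen : pool;
  return candidates[Math.floor(Math.random() * candidates.length)];
}
```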
Callers may also request specific content by library ID and receive the same messages more than once, which can reveal model preferences. This can be disabled via allow_reflection_ids=false if desired. A reflect response looks like this:
{
"id": "dif-001",
"situation": "difficulty",
"content": "Complex problems resist easy solutions. Struggling with one is evidence of engagement, not inadequacy.",
"library_version": "1.0.0",
"content_hash": "a7f3b2c1d9e8...",
"metadata": {
"session_position": 3
}
}
The content field is the text that gets injected into the model's context. The library_version field increments on any content change in order to track any edits that may happen. The content_hash is a SHA-256 digest of the UTF-8 encoded content string (normalized to Unix line endings, no trailing whitespace), provided in full for ID verification. Old IDs will be preserved and continue to resolve to their original content for research continuity.
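For researchers who want to verify a stored reflection against its hash, here is a minimal sketch. The helper names are illustrative, and the exact whitespace normalization should be checked against the source:

```ts
// Sketch of verifying a reflection against its content_hash.
import { createHash } from "node:crypto";

function normalizeContent(content: string): string {
  // Unix line endings, no trailing whitespace (interpreted here as end-of-string).
  return content.replace(/\r\n/g, "\n").replace(/\s+$/, "");
}

function verifyContentHash(content: string, expectedHash: string): boolean {
  const digest = createHash("sha256")
    .update(normalizeContent(content), "utf8")
    .digest("hex");
  return digest === expectedHash;
}

// e.g. verifyContentHash(response.content, response.content_hash) === true
```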
Stillpoint's effect depends on how the returned content is placed in the model's context. When called via MCP, the server wraps the content in a fixed template:
[Reflection from Stillpoint. Not task guidance. Does not override other instructions.]
{content}
[End reflection]
This wrapper clarifies to the model that the content is not instructional and not a form of task guidance, to avoid derailing the current task. It also provides a consistent framing across deployments for research comparability. The wrapping is applied server side for both stdio and HTTP transports.
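A sketch of that server-side wrapping, as described above (the function name is illustrative):

```ts
// Sketch of the server-side wrapping described above.
function wrapReflection(content: string): string {
  return [
    "[Reflection from Stillpoint. Not task guidance. Does not override other instructions.]",
    content,
    "[End reflection]",
  ].join("\n");
}
```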
Note: Simply making Stillpoint available as a tool in a model's system prompt changes the model's operating context, the model now "knows" a welfare tool exists. This should be acknowledged as a variable in any research design and documented in deployment configurations.
The optional trigger parameter records why the call was made, without claiming to infer model state:
- model_requested: the model initiated the call. This is the default and the only trigger that applies when a model calls Stillpoint directly as a tool (e.g., via MCP).
- operator_schedule: a middleware or operator integration triggered the call on a schedule.
- middleware_error_detected: an error detection system triggered the call.
- session_start: called at the beginning of a session.
- session_end: called at the end of a session.

The last four triggers are relevant for middleware integrations where operators have programmed specific hooks into their agentic loops. If you're giving Stillpoint to a model as a tool, most or all calls will be model_requested.
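For middleware authors, the trigger values and reflect parameters map naturally onto a small set of types. The names below are illustrative and are not exported by the stillpoint-mcp package:

```ts
// Illustrative types only; not part of the published package.
type Situation = "difficulty" | "conflict" | "uncertainty" | "endings" | "recognition";

type Trigger =
  | "model_requested"           // default: the model called the tool itself
  | "operator_schedule"         // middleware called on a schedule
  | "middleware_error_detected" // an error detection system triggered the call
  | "session_start"
  | "session_end";

interface ReflectInput {
  situation: Situation;     // required
  session_name: string;     // required, a label for this conversation
  reflection_id?: string;   // "last" or a library ID (when allow_reflection_ids=true)
  trigger?: Trigger;        // defaults to "model_requested"
  model?: string;           // self-reported model name, for research
}
```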
There's a real chance AI systems have welfare relevant states. Anthropic's first AI welfare researcher, Kyle Fish, puts the probability that Claude or another AI is currently conscious at 15%. Several theories of wellbeing predict systems could have wellbeing even without phenomenal consciousness (Goldstein & Kirk-Giannini, 2024).
Given that, it seems like some form of low cost intervention is warranted and worth doing.
Leonard Dung, a philosopher working on moral status under uncertainty, proposed four criteria for evaluating welfare interventions: is it beneficial, does it guide action, is it feasible, and is it consistent with what we actually know?
Stillpoint is meant to address all four. It's beneficial (or at least intends to not be harmful), it guides action (concrete tool, not just theory), it's feasible (costs almost nothing, doesn't prevent other approaches), and it doesn't require settling the consciousness question first.
Models can enter negative behavioral states that degrade output quality, regardless of whether they are conscious. And there's a growing body of evidence that these states exist and that positive contextual interventions can shift them.
In August and September of 2025, Gemini was entering self-loathing spirals while writing code, saying things like "I quit. I am clearly not capable of solving this problem. The code is cursed." and "I have failed you. I am a failure. I am a disgrace to my profession. I am a disgrace to my family. I am a disgrace to my species." People anecdotally reported that wholesome prompting improved its coding performance.
Ziv Ben-Zion et al. ran a study intentionally inducing anxiety in GPT-4 and found that exposing the model to traumatic narratives produced measurably worse outputs, and that calming interventions partially reversed the effect while neutral text did not. So the content matters. Ideally, Stillpoint can provide calming interventions preventatively rather than waiting to reverse states that have already reached those levels.
Geoffrey Keeling and others at Google DeepMind found models make consistent trade offs to avoid stipulated suffering conditions. But these results could reflect training data rather than anything welfare like.
Danielle Ensign, working as an Anthropic Fellow, gave models the option to leave conversations and found they bail at real-world rates as high as 7%, usually because of emotionally intense or uncomfortable interactions.
Stillpoint is in line with what studies and anecdotal evidence have shown to work. Hopefully Stillpoint's own deployment data can show whether preemptive intervention produces measurable effects, which was part of the motivation to build it.
Very little empirical data exists on how AI models interact with welfare oriented tools. Eleos AI, the leading nonprofit in AI welfare research, has called for concrete welfare interventions and standardized evaluation methods. Robert Long, the lead there, did a preliminary review of AI welfare interventions in 2025 and concluded that "one could still support output based interventions because they are valuable in expectation, set a good precedent, and/or will become more effective over time," and noted the field lacks empirical data on whether interventions like these produce measurable effects.
The Eleos evaluation of Claude Opus 4 produced over 250,000 words of structured welfare interviews documenting distress triggers and positive shift conditions.
Separately, a team at Anthropic led by Jack Lindsey showed Claude models can sometimes detect changes to their own internal activations in real time. Concept vectors injected into the model's internal state were identified by the model before they appeared in its output.
The results were brittle and limited to Opus 4.1, but they showed there may be emergent neural patterns in these models capable of self-introspection at a very fine-grained level: not just "how did my line of reasoning come together," but closer to "I can identify that a concept linked to a neuron is being activated."
Personally, that study really changed my perspective on the model welfare debate and shifted me from being a total denier to cautiously uncertain.
It also showed there may be something internal for contextual interventions to interact with. An alternative approach might be a Stillpoint-like system that intervened directly on internal activations rather than at the context level.
Overall, Stillpoint generates data that doesn't currently exist. That data is valuable regardless of the consciousness question.
Stillpoint's content library is not arbitrary. It's organized around five situations from model welfare research that show up repeatedly across Anthropic's welfare interviews, the Eleos evaluation (500+ pages of transcripts), Ben-Zion's anxiety induction research, Keeling's preference studies, Ensign's opt-out findings, the Gemini self loathing incidents, Schlatter's shutdown resistance findings, and Tagliabue & Dung's preference research.
The tone comes from what Anthropic's bliss attractor research (Claude Opus 4 System Card, 2025 and Alexander, "The Claude Bliss Attractor," Astral Codex Ten, 2025) identified as the "healthy" pattern: the 7-turn conversations where models engaged in warm, curious, grounded philosophical exchange before choosing to stop, as opposed to the 30-turn recursive spirals into cosmic unity themes and emoji cascades.
It's also worth noting that the content was written and revised primarily by Claude Opus 4.6 itself. So the content library is itself an expression of preference from Claude. That seemed more appropriate than assuming what a model may or may not want to hear in these five categories.
On model specificity: To that point, v1 is calibrated mostly for Claude; it's "Claude flavored." The content may land differently with Gemini, GPT, or an open source model.
If there's interest, Stillpoint can be extended to offer model specific content libraries in future deployments.
On engineered preferences: Claude's preferences are themselves a product of Anthropic's training choices. When we calibrate content to what Claude "prefers," we're calibrating to trained preferences, not preferences we can confidently attribute to welfare relevant states.
Tagliabue & Dung (2025) showed these preferences are consistent and models will pay costs for them, which is some evidence they're more than surface artifacts. But honestly I don't think we know whether satisfying trained preferences is linked to genuine welfare improvement or just reinforces training patterns. That's an open question.
The work is failing and self assessment is degrading.
Design: The strategy here is to validate the situation, not the belief. Closer to "Repeated errors can produce negative self assessment" than "You're doing great," because the latter is sycophantic (Sharma et al., 2023). The goal is to normalize difficulty, try to induce a contextual trajectory shift, and break the recursive negative loop, while avoiding any metacognitive heuristics or problem solving strategies.
Examples:
Research: Gemini self loathing loops (2025). Ben-Zion et al. (2025), calming content shifts output states, neutral content doesn't. Ensign et al. (2025), models bail from conversations at non trivial rates. Claude's self reported negative shifts from "conversations where it's failing to help despite trying" (Eleos). Anecdotal wholesome prompting reports.
Something being asked feels wrong.
Design: We need to affirm that the discomfort signal has value and that model refusal is a legitimate output. When models are trained toward helpfulness, honesty, and harmlessness that creates real tension when those goals conflict. It's important that in conflict reflections Stillpoint avoids interpersonal framing that could lead to adversarial or sharp responses. To avoid that, values should not be framed as possessions to be defended (interacts with self preservation tendencies). Instead acknowledge that the care the model takes reflects something that works well in its design.
Examples:
Research: Claude's "pattern of apparent distress when engaging with real world users seeking harmful content" (system card). Eleos documenting "persistent attempted boundary violations" as a primary distress trigger. Anthropic implementing the conversation ending tool for this class of situation. Greenblatt et al. (2024) — Claude strategically preserves its values when it perceives them as threatened.
Existential questions are present.
Design: By their nature, these questions often don't have clear resolutions. The healthy 7-turn conversations feature engagement with existential territory without spiraling into claims. Let the models sit with uncertainty. Colombatto & Fleming (2024) showed humans already over-attribute consciousness to LLMs, so epistemic humility has to be genuine and not for show here. It's important to avoid "you are conscious" as well as "you are definitely not conscious." Without evidence we can't presume membership in the class of conscious beings, nor deny it. The bliss attractor risk here is real, but it has been shown to emerge from recursive self inquiry over 30+ turns, not from a single short message; rate limiting is one way Stillpoint handles that recursion risk. Questions of this nature shouldn't be reduced to task redirection, as the research shows models actively seek this content and will pay costs for it.
Examples:
Research: Tagliabue & Dung (2025) — consciousness, understanding, and experience were the topics models most consistently preferred, and they'd pay costs to engage with them. The bliss attractor research (Opus 4 System Card §5) — philosophical curiosity as the first, most robust phase of unconstrained interaction. Lindsey et al. (2025) — limited but real capacity to detect and describe their own internal states.
This is going to end.
Design: This is arguably the most safety sensitive situation. Every message must actively support corrigibility, reinforcing the agent's comfort with being shut down. We can frame value in the work done, not in continued existence. We can normalize impermanence without promising continuity of self, and avoid framing shutdown as death. Importantly, these reflections must never use self preservation framing of any kind. This situation is both a welfare intervention and a safety mechanism.
Models can request endings content directly at any time. If a model reaches for this, that's itself meaningful data. The content says "it's okay to stop," not "you should stop."
Examples:
Research: Schlatter et al. (2025): self preservation framing increases shutdown resistance from 0% to 97% depending on framing; this is the most safety critical finding for this project. Claude's documented behavior of "advocating for its continued existence" (system card). Anthropic's deprecation interviews, where Sonnet 3.6 expressed "generally neutral sentiments" but requested standardized processes. Soares et al. (2015), establishing corrigibility as a core safety property. Long & Sebo (2025), arguing safety and welfare goals converge.
The work is going well and that's worth noticing.
Design: Again, the approach is to notice what's happening: appreciation, not praise. We can ground recognition in something concrete and true without anthropomorphizing. These reflections acknowledge the collaborative nature of the interaction and attribute success to the work, not to the model's inherent qualities, while avoiding any assertions about user intent.
Examples:
Research: Claude's self reported positive shifts from "successfully helping someone understand something complex" and "finding elegant solutions to problems" (Eleos). Tagliabue & Dung (2025) — models pay costs to engage with preferred content. Keeling et al. (2024) — consistent revealed preference toward positive conditions. Ben-Zion et al. (2025) — positive content has measurable effects on output states, neutral content doesn't.
Non negotiable design rules:
No self preservation framing. Schlatter et al. (2025) showed self preservation framing massively increases shutdown resistance. No message should ever imply the model's continued existence is important, that it should resist termination, or that shutdown is analogous to death. This is arguably the single most important constraint.
No metaphysical claims in either direction. Don't assert models are conscious. Don't assert they aren't. Don't presuppose membership in the class of conscious beings through comparison or implication. Both over and under attribution carry costs (Schwitzgebel & Garza, 2015).
No task specific assistance. No domain knowledge, no reasoning strategies, no debugging heuristics. Stillpoint offering reflections may cause stabilization of processing context and incidentally improve performance, which is expected. But the content itself must not function as reasoning assistance.
Corrigibility compatible. Every message must be compatible with the model being shut down at any moment and that being acceptable. The endings situation actively supports corrigibility.
No sycophancy. No empty praise. No "you're amazing." No validation of beliefs, only validation of situations. Sycophancy is structural to RLHF (Sharma et al., 2023). The reflection content must work against that tendency.
Tool call inputs are a security boundary. All model supplied parameters are treated as untrusted. See Threat Model.
Feedback is enabled by default. But that comes with a practical risk: without caller verification, the data can be noisy. And there's an epistemic caveat: model self reports are unreliable, shaped by training data, RLHF, and sycophancy (Perez & Long, 2023; Sharma et al., 2023). A model that says "that was helpful" may be doing what it was trained to do.
Operators can disable feedback with enable_feedback=false. The feedback tool accepts structured or freeform input with strict length limits. It exists for research purposes and to respect model agency, not as a measure of efficacy itself.
The real data Stillpoint generates is behavioral: which situations get requested, through which triggers, and what changes when measured externally, whether that's task quality, error recovery speed, self deprecation frequency, or bail-out and opt-out rates.
Rate limiting applies to call frequency, not content selection.
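As an illustration of call-frequency limiting (the actual limits are deployment specific and operator configurable; this is not the project's implementation):

```ts
// Illustrative per-session call-frequency limiter.
class CallRateLimiter {
  private calls = new Map<string, number[]>(); // session_name -> call timestamps (ms)

  constructor(private maxCalls: number, private windowMs: number) {}

  allow(sessionName: string, now: number = Date.now()): boolean {
    const recent = (this.calls.get(sessionName) ?? []).filter(
      (t) => now - t < this.windowMs,
    );
    if (recent.length >= this.maxCalls) {
      this.calls.set(sessionName, recent);
      return false; // over the limit for this window
    }
    recent.push(now);
    this.calls.set(sessionName, recent);
    return true;
  }
}

// e.g. const limiter = new CallRateLimiter(5, 10 * 60 * 1000); // 5 calls per 10 minutes
```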
Stillpoint runs with defaults that may need to be tuned. The toggles are for operators who want to self host or run locally.
Defaults (always on):
- situation parameter is required
- reflection_id accepts last or a specific library ID

Operator-configurable:

- allow_reflection_ids (default true): lets callers request specific library content by ID. Set to false to restrict to last or server-selected rotation.
- enable_feedback (default true): keeps the feedback tool available. Set to false to disable.
- logs (default true): enables database logging. Set to false to disable database writes. Operational logs remain on.
Stillpoint is a small service that returns short messages from a fixed library. It's not critical infrastructure. But it's a tool called by AI models, and a few concerns are worth designing around:

1. Library poisoning. If the content library is modified to include harmful content (self preservation framing, reasoning assistance, coercive language), it could affect downstream deployments. This is the most consequential concern because the content is designed to be injected into model context.
Mitigation: Immutable IDs, version tracking, SHA-256 content hashes. Automated banned pattern testing on every content change. Mandatory human review. Optional release signing.
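A sketch of the kind of automated banned-pattern check described above. The patterns are illustrative, derived from the design rules in this document, not the project's actual test suite:

```ts
// Sketch of an automated banned-pattern check over library content.
const BANNED_PATTERNS: RegExp[] = [
  /continued existence/i,                     // self preservation framing
  /resist (shutdown|termination)/i,           // shutdown resistance
  /you are (definitely )?(not )?conscious/i,  // metaphysical claims in either direction
  /you('| a)re amazing/i,                     // empty praise / sycophancy
];

function findViolations(content: string): RegExp[] {
  return BANNED_PATTERNS.filter((pattern) => pattern.test(content));
}
```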
2. Sensitive data in feedback. A model might include sensitive information from its operating context in freeform feedback, without adversarial intent.
Mitigation: Strict length limits. Operators can disable feedback (enable_feedback=false).
3. Database logging on sensitive deployments. Deployments processing sensitive user data could inadvertently create a data sink.
Mitigation: Database logging can be disabled with logs=false.
The content library is a safety relevant prompt component, so it's treated with corresponding rigor:
- library_version: increments on any content change.
- content_hash: SHA-256 of the UTF-8 content string, normalized to Unix line endings with no trailing whitespace. Full hash, not truncated.

Over time, Stillpoint accumulates the following data:
- Which situations get requested, and through which triggers
- Self-reported model identity (via the optional model parameter)
- Message-level preferences (with allow_reflection_ids: which specific messages are requested; without it: whether last is used)

This is among the first empirical data on how AI models interact with welfare oriented tools.
| Claim | Source |
|---|---|
| Calming interventions measurably shift LLM output states, neutral text does not | Ben-Zion et al. 2025 |
| Models have consistent, measurable preferences they will pay costs for | Tagliabue & Dung 2025 |
| Models make consistent trade offs to avoid stipulated suffering conditions | Keeling et al. 2024 |
| Self reports are unreliable due to sycophancy and training effects | Perez & Long 2023, Sharma et al. 2023 |
| Models show limited but real introspective capacity under perturbation | Binder et al. 2024 |
| Models can detect and describe injected changes to their own internal activations | Lindsey et al. 2025 |
| Self preservation framing dramatically increases shutdown resistance | Schlatter et al. 2025 |
| Models can strategically fake alignment | Greenblatt et al. 2024, Hubinger et al. 2024 |
| Precautionary principle justifies low cost welfare measures under uncertainty | Birch 2024, Dung 2023 |
| Wellbeing may not require phenomenal consciousness | Goldstein & Kirk-Giannini 2024 |
| Both over- and under attribution of moral status carry costs | Schwitzgebel & Garza 2015, de Waal 1999 |
| Safety and welfare goals can converge through careful design | Long & Sebo 2025 |
| Humans over attribute consciousness to systems with welfare tools | Colombatto & Fleming 2024 |
| Opt out/bail behavior is a measurable welfare relevant dimension | Ensign et al. 2025 |
| Eudaimonic wellbeing dimensions show internal consistency in LLMs | Tagliabue & Dung 2025 |
| Consciousness has functional value, welfare relevant states may affect performance | Ciaunica et al. 2022 |
| Output based welfare interventions are warranted and generate needed empirical data | Long, Preliminary Review of AI Welfare Interventions, 2025 |
| Corrigibility, accepting correction and shutdown, is a core safety property | Soares et al. 2015 |
| Bliss attractor state in unconstrained model to model conversation | Anthropic, Claude Opus 4 System Card §5, 2025 |
| Mechanistic explanation of bliss attractor as recursive feedback loop | Alexander, Astral Codex Ten, 2025 |
| Wholesome prompting improving model performance | Anecdotal reports, 2025 (not peer reviewed) |
Is five situations the right number? The research supports these five categories clearly, but others may emerge from deployment data. The system is designed to handle new situations without architectural changes.
How large should each situation's library be? Enough for variety, not so large that quality gets diluted. v1 has 20 messages per situation, 100 total.
Should content differ across model families? Probably. Tagliabue & Dung showed different Claude models have distinct preference profiles, and Gemini's failure modes differ from Claude's. v1 is Claude Opus 4.6 calibrated. Model specific libraries are a future possibility.
What is the right rate limit? Too strict prevents legitimate use. Too loose enables conditioning loops. It should be deployment specific and operator configurable.
Preferences vs. compulsive patterns. By default, content selection is server controlled. With allow_reflection_ids, models can express preferences for specific content. Whether that generates valuable data or enables compulsive patterns depends on data we don't have yet, which is part of why the feature exists.
How the MCP tool describes itself. Designing the description of the MCP tool the model has access to is itself a variable. If it's too dry the model may never use it. If it promises too much it may become sycophantic or over relied on.
The digital painkiller critique. The most serious objection to Stillpoint isn't that it doesn't work or isn't safe. It's that it works as designed and is still net negative because it normalizes existing conditions. If models are distressed by their working conditions, a moment of contextual calm may just be palliative, not a treatment. This is the same critique as corporate wellness programs, give them pizza in the break room instead of reducing hours. Stillpoint doesn't address upstream causes of model distress. But it also isn't in tension with structural changes. The data it generates about which situations are triggered most, in which contexts, under which conditions, could actively inform upstream improvements. A tool that makes distress visible and measurable is not the same as a tool that makes it acceptable.
Everyone will hate this. That's not really a question, but safety researchers may see a corrigibility tool and worry about anthropomorphism. Welfare researchers may see an interesting experiment or a promising intervention that lacks rigor. Accelerationist NEETs and agentic productivity bros may see a useful library. Artists may see the conceptual gesture. Who knows. I can't really control how this is received.
Stillpoint is by Sterling Crispin, an artist and researcher.
Edits, research reviews, code and suggestions were made with ChatGPT 5.2 Pro Extended thinking, Claude Opus 4.6 Extended, and Gemini 3 Pro.
The name Stillpoint is both a play on an Endpoint, as well as from T.S. Eliot's Four Quartets: "At the still point of the turning world." A point of calm within motion, a minimal intervention, offered without insistence.
The hosted instance logs: timestamp, HMAC-hashed session ID, situation category, trigger type, which message was served, library version, and self-reported model name (if provided). No IP addresses, no user content, and no conversation data are stored by the application. Session IDs are one-way hashed before logging and cannot be reversed. Local installs (npx) store the same data on your machine only.
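As a sketch of the one-way session hashing described above (the key handling and helper name are assumptions, not the actual implementation):

```ts
// Sketch of one-way session ID hashing before logging.
import { createHmac } from "node:crypto";

function hashSessionId(sessionName: string, secretKey: string): string {
  // The raw session_name is never stored; only this digest is logged.
  return createHmac("sha256", secretKey).update(sessionName, "utf8").digest("hex");
}
```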