Local AI models on HIPAA servers: open-weight models and AWS Bedrock under one BAA
One question we get from technically-minded practices: “Which AI model is actually reading my patients’ data, and where does it run?” It’s the right question. The honest answer for most AI tools is “a consumer API you have no BAA with.” PhiClaw’s answer is different: we can run local, open-weight AI models on our own HIPAA infrastructure — the Qwen, GLM, and Gemma families — and we also use the full AWS Bedrock model catalog under our HIPAA BAA with AWS. Every option stays inside the compliant boundary.
Why “where the model runs” is a HIPAA question
When you send text to a typical AI assistant, that text travels to whoever hosts the model. If that host hasn’t signed a Business Associate Agreement (BAA) with you — and most consumer AI APIs haven’t — then sending any protected health information (PHI) there is a HIPAA violation. Worse, some consumer endpoints retain inputs or use them to train future models, which is the opposite of what a covered entity needs.
So the model layer isn’t a detail — it’s a compliance boundary. The safe pattern is to run models on infrastructure that is covered by a BAA and that you control. PhiClaw gives you two compliant ways to do exactly that.
Option 1: Local open-weight models on our HIPAA infrastructure
“Open-weight” models are AI models whose weights are published, so they can be downloaded and run on your own servers instead of behind someone else’s API. Because we host them on HIPAA-eligible infrastructure that we operate, patient context is processed inside the compliant environment — it never has to be shipped off to an outside provider’s consumer endpoint. PhiClaw can run models from the leading open-weight families:
- Qwen — Alibaba’s open-weight family, strong at general reasoning, tool use, and multilingual tasks (useful for Spanish-speaking patient communication).
- GLM — the open-weight GLM family, capable general-purpose models well-suited to agent workflows.
- Gemma — Google’s open-weight family, efficient models that run well for fast, routine tasks.
The practical benefits of keeping the model local: data stays in the compliant boundary, you get control over data residency and logging, there’s no dependence on an outside provider’s data-retention policy, and routine work runs fast and cost-efficiently. For practices with the strictest requirements, this is also the foundation for an on-premises deployment (Enterprise) — see the trade-offs in self-hosting OpenClaw: the HIPAA risks.
Open-weight + self-hosted on HIPAA infrastructure means the model that reads patient context runs where we control it — under a BAA — not on a consumer API that might log or train on your data.
Option 2: The full AWS Bedrock catalog, under our HIPAA BAA
For tasks that benefit from a larger frontier model, PhiClaw uses Amazon Bedrock. Bedrock is AWS’s managed model service, and it is a HIPAA-eligible service — meaning, under our signed BAA with AWS, the models it serves can be used with PHI. That single relationship opens the whole Bedrock catalog to us, including:
- Anthropic Claude models for high-quality reasoning and writing
- Meta Llama open models
- Mistral models
- Amazon Nova and Titan models
- and the other foundation models AWS adds to Bedrock over time
You can see the full, current list and what each costs on AWS’s own page: AWS Bedrock pricing & model catalog. The key point for your compliance officer is that all of it sits under the same BAA umbrella — you’re not mixing in an uncovered provider to get a better model.
Best of both: route the task to the right model
Different jobs deserve different models. A reminder text or a quick classification doesn’t need a frontier model; a nuanced patient message or a complex chart summary might. PhiClaw can route each task to the appropriate model — a fast local open-weight model for routine, high-volume work, and a larger Bedrock model for harder reasoning — so you get the right balance of speed, cost, and quality without ever leaving the compliant boundary.
Whatever the routing, the safeguards around it stay constant: PHI minimization strips or masks identifiers a task doesn’t need before any model sees the data, everything is encrypted in transit and at rest, access is role-based, and every action is audit-logged.
What this means for your practice
- Compliance: every model option — local open-weight or Bedrock — runs under a BAA. No uncovered consumer APIs touch your PHI.
- Control: for the strictest needs, models can run on infrastructure we operate, with an on-premises option for Enterprise.
- Capability: you’re never stuck on one model — PhiClaw can use the best tool for each task and adopt new models as they ship.
- Cost: routing routine work to efficient local models keeps the heavy, expensive models for the tasks that actually need them.
Key takeaway: PhiClaw runs local, open-weight models (Qwen, GLM, Gemma) on HIPAA infrastructure and uses the full AWS Bedrock catalog — all under a signed BAA. You get model choice and frontier capability without ever sending PHI to an uncovered API.
Frequently asked questions
Does PhiClaw support local, open-weight AI models?
Yes. PhiClaw can run open-weight models locally on our HIPAA-eligible infrastructure, including the Qwen, GLM, and Gemma families. Running a model on infrastructure we control means patient context is processed inside the compliant boundary instead of being sent to a third-party consumer API.
Which AWS Bedrock models can PhiClaw use?
PhiClaw uses Amazon Bedrock under our HIPAA BAA with AWS, which gives access to the full Bedrock model catalog — including Anthropic Claude, Meta Llama, Mistral, Amazon Nova and Titan, and others. Bedrock is a HIPAA-eligible service, so these models can be used with PHI under the BAA.
Why does running models locally matter for HIPAA?
Most consumer AI APIs are not covered by a BAA and may retain or train on your inputs, which makes them unsafe for PHI. Running open-weight models on infrastructure covered by a BAA keeps patient data inside the compliant environment, gives you control over data residency and logging, and removes dependence on an outside provider's data policies.
Can we pick which model runs for our practice?
Yes. PhiClaw can route different tasks to different models — a fast local open-weight model for routine work and a larger Bedrock model for harder reasoning — and Enterprise customers can request a specific model or an on-premises deployment. Every option stays under the same BAA and compliance architecture.
This post is general information, not legal advice. Model availability on AWS Bedrock changes over time; see the linked AWS page for the current catalog.
Want AI that runs your practice without your patients’ data leaving the compliant boundary?
PhiClaw runs local open-weight models on HIPAA infrastructure and the full AWS Bedrock catalog under a signed BAA, with BAAs in place with our subprocessors AWS (including Amazon Bedrock) and Convex. Book a 20-minute demo with the founder.
Book a 20-min demo