What should be on a freelance AI engineer's invoice?

A 2026 freelance AI engineer invoice has six load-bearing line items beyond the standard header/footer: (1) engineering hours or milestone fee, (2) LLM inference cost (input + output tokens at the provider's published rate, marked as pass-through), (3) embedding cost (per Finout's RAG cost decomposition framework), (4) vector DB hosting (Pinecone, Weaviate, pgvector self-hosted), (5) GPU hours for training or fine-tuning runs (with the cloud provider and instance type named), (6) eval delivery acceptance (the gating milestone for project sign-off). Add deposit credit as a negative line if a deposit was collected. Per [Plutio's 2026 invoice payment terms guide](https://www.plutio.com/freelancer-magazine/invoice-payment-terms), standard deposit is 50 percent for new clients and 25 percent for established. The companion rate research that justifies the hourly numbers is in [AI engineer freelance rates 2026](/blog/ai-engineer-freelance-rates-2026).

How do I bill clients for OpenAI API or compute costs?

Two patterns work in 2026: pass-through with markup (most common) or pass-through at cost (mature-client preference). Pass-through with markup adds 10-20 percent to the actual provider cost as a line item to compensate for the cash-flow burden of fronting the API spend on your own card. Pass-through at cost lists the actual provider charges with screenshots from the OpenAI / Anthropic / cloud dashboard attached as backup, and the client pays you back the exact amount. Per [Finout's 2026 FinOps for AI guide](https://www.finout.io/blog/finops-in-the-age-of-ai-a-cpos-guide-to-llm-workflows-rag-ai-agents-and-agentic-systems), mature AI buyers expect cost reports that decompose into LLM inference cost, embedding cost, and vector DB cost as separate lines because each can balloon independently. Mirror that decomposition on your invoice. Reference the provider rate at [OpenAI API pricing](https://openai.com/api/pricing/) directly so the client can audit. The cleanest setup is to put the client on their own provider account and bill only your engineering work; this eliminates pass-through entirely but requires the client to manage their own cost center.

Should I bill AI engineering by the hour or by milestone?

Hourly works for unscoped exploration and ongoing maintenance: prompt iteration, eval debugging, infrastructure hardening, on-call response. Milestone works for delivery work with definable acceptance criteria: RAG MVP, fine-tuning runs, agent system buildout, MLOps platform setup. Per [Plutio's 2026 invoice payment terms guide](https://www.plutio.com/freelancer-magazine/invoice-payment-terms), milestone billing is recommended for projects exceeding 10,000 dollars because it keeps cash flowing throughout the project instead of concentrating revenue at the end and limits exposure if a client disappears mid-project. The hybrid most senior AI engineers use in 2026 is fixed-bid with milestones for the planned scope plus an explicit hourly rate for change requests outside that scope. Per [Second Talent's 2026 freelance ML engineer hourly rate](https://www.secondtalent.com/resources/freelance-ml-engineer-hourly-rate-us/), the senior median hourly rate is 185 dollars, which sets the change-request anchor for a senior-level engagement.

What payment terms should AI engineers use in 2026?

Net 15 is the freelancer-preferred default in 2026. Per [Plutio's 2026 invoice payment terms research](https://www.plutio.com/freelancer-magazine/invoice-payment-terms), North American suppliers wait 43 days on average from invoice to payment, which means Net 30 invoices often clear around day 40 to 50 in practice; Net 15 cuts the wait window in half without creating friction with clients. Per [LedgerUp's 2026 Net 15 payment terms guide](https://www.ledgerup.ai/resources/net-15-payment-terms), Net 15 fits smaller projects, recurring invoices with established customers, and relationships where the seller has pricing leverage; Net 30 fits enterprise procurement workflows that rarely align with a 15-day window. Per [SolidGigs' 2026 freelance payment terms guide](https://solidgigs.com/blog/freelance-payment-terms-to-get-paid/), 1.5 percent per month late fee is the standard add-on. The practical playbook for AI engineering: quote Net 15 by default; accept Net 30 for enterprise clients who require it; collect a 25-50 percent non-refundable deposit before any work begins; add a 25 percent kill fee on remaining unpaid project value if the client terminates mid-engagement.

How do I invoice for RAG, fine-tuning, or agent projects?

Each AI project shape has a natural milestone structure. RAG MVP: split into 3 milestones tied to acceptance criteria (data ingestion + chunking pipeline accepted, retrieval relevance threshold passed, end-to-end eval pass at the agreed accuracy floor). Fine-tuning: split into 3 milestones (curated training set delivered, fine-tuned model meets eval threshold on held-out set, production deployment with monitoring live). AI agent system: split into 3-5 milestones (single-task agent with eval, multi-step orchestration with recovery handling, full agent with monitoring + cost guards live). Each milestone gets its own invoice with the milestone fee, any approved change orders as separate lines, and compute pass-through for that milestone's run costs. Per [Plutio's 2026 invoice payment terms guide](https://www.plutio.com/freelancer-magazine/invoice-payment-terms), a 33/33/34 split across 3 milestones is the standard pattern for projects in the 10,000-50,000 dollar range. Define each milestone as a binary deliverable that the client signs off on, not a calendar date; calendar-date milestones invite payment disputes when work slips for legitimate reasons.

AI Engineer Invoice Template 2026: Hourly, Project, Retainer, and Compute Pass-Through Billing

A 2026 freelance AI engineer invoice has a different shape than a generic freelance invoice. The line items have to handle compute pass-through (LLM inference cost, embedding cost, vector DB hosting, GPU hours), the milestones have to align with eval-acceptance gates rather than calendar dates, and the payment terms have to absorb the iteration risk that comes with production AI work. This post is the AI-engineering-specific template covering all three.

The general invoice basics live in how to write a freelance invoice. The hourly + project + retainer rate research that justifies the dollar amounts is in AI engineer freelance rates 2026. The companion proposal that locks deliverables and eval methodology before the invoice goes out is in AI engineer proposal that wins.

Why AI Engineer Invoices Are Different

A copywriter invoice has 1-2 lines. A web developer invoice has 1-12 lines depending on milestones and pass-throughs. An AI engineer invoice often has 6-15 lines because compute pass-through is itself decomposed into multiple cost lines that mature buyers expect to see broken out.

Profession	Typical lines per invoice	Unique cost element
Copywriter	1-2	Word count and revisions
Web developer	1-12	Scope changes, hosting passthroughs, platform fees
AI engineer	6-15	LLM inference + embedding + vector DB + GPU hours + eval gates
Videographer	6+	Day rate, kit, crew, post, license

The decomposition matters because of how mature AI buyers think about cost. Per Finout's 2026 FinOps for AI guide, the recommended FinOps practice is to "break down the cost of the RAG pipeline into its parts. For example, in our cost reports we separate 'LLM inference cost', 'Embedding cost', and 'Vector DB cost'." When your invoice mirrors that decomposition, it slots directly into the client's cost-tracking workflow. When your invoice rolls everything into a single "compute" line, it triggers procurement questions and slows payment.

The 4 Billing Models for AI Engineering in 2026

There are four production billing models. Each has a use case and a specific invoice format.

Model	Use case	Invoice format
Hourly	Exploration, eval debugging, on-call	Line per task with hours and rate
Fixed-bid project	Locked-scope deliverable (e.g., RAG MVP)	One project line + change-order lines + compute pass-through
Milestone billing	Multi-stage delivery work over $10K	One invoice per milestone with milestone fee + per-milestone compute
Retainer (monthly)	Ongoing dev + maintenance + on-call	Fixed monthly retainer line + overage hours line

Most senior AI engineering engagements in 2026 are hybrids: fixed-bid milestones for the planned scope plus an explicit hourly rate for change requests outside the bid plus pass-through compute on top. Per Second Talent's 2026 freelance ML engineer hourly rate, the senior median hourly anchor is 185 dollars and specialists in distributed training and MLOps charge 275-450 dollars; per Second Talent's 2026 freelance LLM developer hourly rate, LLM specialists earn 75-700 dollars with senior median 210 dollars and fine-tuning/RLHF specialists 350-700 dollars. Set your change-request hourly rate at the same number you would charge for a standalone hourly engagement, not below.

Sample Hourly Invoice (RAG Eval Debugging)

The hourly format works for exploration, eval debugging, prompt iteration, and incident response. Each task gets its own line.

Description	Hours	Rate	Amount
Eval failure diagnosis (citation-grounding regression)	4.5	$185	$832.50
Retrieval pipeline tuning (chunk size + reranker config)	6.0	$185	$1,110
New eval set construction + baseline scoring	5.0	$185	$925
Production cutover monitoring	2.0	$185	$370
Subtotal (engineering)	-	-	$3,237.50
LLM inference cost pass-through (Anthropic Claude API, period dashboard attached)	-	-	$187.40
Embedding cost pass-through (Cohere embeddings, period dashboard attached)	-	-	$42.10
Total due	-	-	$3,467

Notes on the format:

Each engineering line names the actual deliverable, not "consulting" or "engineering work" generically. Per Plutio, vague line items are the number-one cause of payment disputes.
Compute pass-through gets its own subtotal block below the engineering subtotal so the client can see what is your time vs what is their cloud spend.
Attach the provider dashboard screenshots as PDF backup. Most disputes on pass-through happen because the client doesn't believe the cost figure; the screenshot ends the question.

Sample Milestone Invoice (RAG MVP, 33/33/34 Split)

For a $30,000 RAG MVP project split across three milestones, each milestone produces one invoice. The split is 33% / 33% / 34% per Plutio's 2026 invoice payment terms guide, which recommends 3-stage milestone splits for projects in the $10,000-$50,000 range. This is the second of three invoices.

Description	Amount
Milestone 2 fee - Retrieval relevance threshold passed (per Section 4.2 of MSA)	$9,900
Approved change order CR-002: legal-doc OCR pipeline added	$2,400
LLM inference cost pass-through (training + eval runs, Anthropic dashboard attached)	$612
Embedding cost pass-through (initial corpus + delta indexing)	$148
Vector DB hosting (Pinecone, March 2026, dashboard attached)	$89
GPU hours pass-through (eval batch runs, Modal dashboard attached)	$76
Subtotal	$13,225
Less: Deposit credit (25% of total project value applied to milestones 1+2)	-$3,500
Total due	$9,725
Payment terms: Net 15. Late fee 1.5% per month after due date.	-

Notes on the format:

The milestone fee references the contract section number that defines the deliverable. This makes the invoice self-documenting against the MSA and reduces "what did I sign up for again?" disputes.
Approved change orders get their own line with the CR number. Never bury a change order inside the milestone fee; it triggers procurement audits.
Each compute pass-through line names the provider AND notes that the dashboard is attached. Mature buyers will not reimburse pass-through without backup.
Deposit credit is applied as a negative line. The cleanest pattern is to apply 50% of the deposit to milestone 1 and 50% to milestone 2 (so milestone 3 is paid in full with no deposit credit and the client gets the final delivery before the final dollar moves).

Sample Retainer Invoice (Standard AI Maintenance + Dev)

Retainers became standard in 2026 for production AI work where the engineer is on-call for incidents and continuous improvement. The retainer covers a defined hour bank; overages bill at the standard hourly rate.

Description	Amount
Monthly retainer - Standard tier (40 hours included, $9,000)	$9,000
Overage hours (3 hours @ $185/hr)	$555
LLM inference cost pass-through (March 2026 production usage, dashboard attached)	$1,247
Embedding cost pass-through (continuous indexing for new docs)	$214
Vector DB hosting (Pinecone, March 2026)	$89
Total due	$11,105
Payment terms: Net 15. Auto-renews monthly unless cancelled by 15th of prior month.	-

Notes:

Retainer fee is constant month over month. Overages are the only variable on the engineering side.
For specialist retainers (LLM/MLOps/safety), the same structure applies but the retainer fee scales: $15K-$40K/month for specialist on-retainer arrangements with safety, MLOps, or LLM evaluation expertise on-call for incident response.
Cancellation language matters. The "by 15th of prior month" clause prevents a client from cancelling on the 30th and avoiding the next month's invoice; this is the AI-engineering equivalent of the standard SaaS cancellation cutoff.

Compute Pass-Through Line Items (Reference)

Cost component	Typical billing unit	Where to source the dashboard for backup
LLM inference (input + output)	Per million tokens at provider rate	OpenAI, Anthropic, Google, AWS Bedrock dashboards
Embedding generation	Per million tokens	Cohere, OpenAI, Voyage AI dashboards
Vector DB storage + queries	Per pod-hour or per query	Pinecone, Weaviate Cloud, pgvector hosting
GPU hours (training/fine-tune)	Per GPU-hour	Modal, RunPod, Lambda Labs, AWS SageMaker, GCP
Cloud egress (model serving)	Per GB egress	AWS, GCP, Azure billing dashboards
Monitoring / observability	Per signal volume or event count	Langfuse, Helicone, Datadog AI integrations

For current per-token rates by model, link your client to OpenAI API pricing directly rather than quoting a number that will be stale within a quarter. Per Finout, there can be 30x-200x cost variance between an unoptimized AI deployment and a well-optimized one, so the per-token rate is the floor; the architecture decisions are where the real money is. That is one reason mature buyers value engineers who can show pass-through line items decomposed by component (it signals you understand where their money goes).

A useful framing from Finout: 80 percent or more of the true cost in agentic deployments lies in hidden areas (compliance, dev hours, maintenance) rather than direct cloud bills. Your invoice's engineering hours line is part of that 80 percent. Naming it explicitly (rather than letting the client compare your engineering hours line to a much smaller compute pass-through line and feel sticker shock) helps frame the value.

Deposit Schedule by Project Size

Per Plutio's 2026 invoice payment terms guide, deposit norms scale with project size and client trust.

Project size	New client deposit	Established client deposit	Notes
Under $2,000	50% (with Due on Receipt or Net 7 terms)	25%	Plutio explicit recommendation
$2,000-$10,000	50% with milestones	25% with milestones	Plutio: 25-50% range with milestone billing
Over $10,000	25% with milestones	25% with milestones	Plutio: 25% deposit recommended for large projects

For AI engineering specifically, the deposit also covers the cost of standing up your dev environment (provider accounts, eval infrastructure, monitoring tooling) before client revenue starts flowing. Without a deposit, you absorb the setup cost AND front the early compute spend on your card. Both add up fast on a non-trivial AI project.

Late Fee + Kill Fee Clauses

Two clauses that should appear on every AI engineering invoice or attached MSA:

Late fee. 1.5 percent per month on overdue balances per SolidGigs' 2026 freelance payment terms guide, or the maximum allowed by your local laws if higher. Activates on the day after the due date; compounds monthly. The clause exists less for revenue and more as a behavioral lever; clients pay your invoices first when there is a real cost to delay.
Kill fee. 25 percent of remaining unpaid project value due within 14 days of termination per Plutio's 2026 invoice payment terms guide. The kill fee compensates for lost opportunity cost and ramping-up time when a client cancels mid-engagement. For AI projects this matters more than typical software work because AI engagements often involve provider accounts, eval infrastructure, and tooling investments that don't carry over to the next client.

The practical phrasing for the kill-fee clause: "If Client terminates this agreement after work has begun, a kill fee of 25 percent of the remaining unpaid project value is due within 14 calendar days of termination."

What This Means for Sending Your Next AI Engineer Invoice

Three takeaways for an AI engineer about to send the next invoice:

Decompose pass-through into the components mature buyers expect. LLM inference, embedding, vector DB, GPU hours as separate lines. Mirror the FinOps cost-decomposition framework. Attach provider dashboards as backup.
Tie milestone fees to acceptance criteria, not calendar dates. Calendar-date milestones invite disputes when work legitimately slips. Acceptance-criteria milestones (eval threshold met, retrieval relevance passed, production cutover live) are auditable and defensible.
Build deposit + late fee + kill fee into the contract template once, then never argue about them again per project. The financial mechanisms that make the invoice enforceable belong in the MSA. The invoice references them; it doesn't relitigate them.

The companion rate research that justifies the hourly numbers is in AI engineer freelance rates 2026. The proposal format that locks scope before this invoice goes out is in AI engineer proposal that wins. The general invoice fundamentals are in how to write a freelance invoice. The retainer-vs-hourly framing for related consulting work is in consultant invoice retainer hourly value. The web-development comparison invoice is in web developer invoice 2026. The deeper payment-terms playbook is in freelance payment terms and the late-paying-client playbook is in late-paying clients.

To send this invoice without rebuilding the line-item structure each time, use FreelanceDesk's invoice generator which preserves the compute pass-through structure as a saved template.

AI Engineer Invoice Template 2026: Hourly, Project, Retainer, and Compute Pass-Through Billing

Why AI Engineer Invoices Are Different

The 4 Billing Models for AI Engineering in 2026

Sample Hourly Invoice (RAG Eval Debugging)

Sample Milestone Invoice (RAG MVP, 33/33/34 Split)

Sample Retainer Invoice (Standard AI Maintenance + Dev)

Compute Pass-Through Line Items (Reference)

Deposit Schedule by Project Size

Late Fee + Kill Fee Clauses

What This Means for Sending Your Next AI Engineer Invoice

References

Frequently Asked Questions

Related Articles

AI Engineer Freelance Rates 2026: Specialization, Experience, Geography, and Platform Take-Home (Aggregated from 10 Sources)

AI Engineer Proposal That Wins (2026 Template): Sections, Eval Methodology, Compute Cost Forecast, 3-Tier Pricing

How to Write a Freelance Invoice That Gets You Paid Faster

AI Engineer Invoice Template 2026: Hourly, Project, Retainer, and Compute Pass-Through Billing

Why AI Engineer Invoices Are Different

The 4 Billing Models for AI Engineering in 2026

Sample Hourly Invoice (RAG Eval Debugging)

Sample Milestone Invoice (RAG MVP, 33/33/34 Split)

Sample Retainer Invoice (Standard AI Maintenance + Dev)

Compute Pass-Through Line Items (Reference)

Deposit Schedule by Project Size

Late Fee + Kill Fee Clauses

What This Means for Sending Your Next AI Engineer Invoice

References

Frequently Asked Questions

1What should be on a freelance AI engineer's invoice?

2How do I bill clients for OpenAI API or compute costs?

3Should I bill AI engineering by the hour or by milestone?

4What payment terms should AI engineers use in 2026?

5How do I invoice for RAG, fine-tuning, or agent projects?

Related Articles

AI Engineer Freelance Rates 2026: Specialization, Experience, Geography, and Platform Take-Home (Aggregated from 10 Sources)

AI Engineer Proposal That Wins (2026 Template): Sections, Eval Methodology, Compute Cost Forecast, 3-Tier Pricing

How to Write a Freelance Invoice That Gets You Paid Faster