The starting point
We host business websites on Cloudflare Pages. All of them deploy with git push to Cloudflare's global CDN.
Static hosting is the easy part. The problems started when the sites went live and real traffic showed up.
Problem: Contact forms attracted spam within days
Every business site needs a contact form. Within days of launching a realtor client's website, the owner's inbox was flooded with spam submissions — automated bots filling in every field with junk. The form worked perfectly; it just worked for bots too.
Solution: Turnstile for bot protection, honeypots for the rest
We added Cloudflare Turnstile, a CAPTCHA alternative that verifies visitors are human without making them solve puzzles. Unlike reCAPTCHA, it doesn't track users across sites or require a Google account. On the server side, every Turnstile token is verified before any business logic runs. The verification request also passes along the visitor's IP, read from the CF-Connecting-IP header, so Cloudflare can detect token reuse from different addresses.
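As a rough sketch, the server-side check might look like the following. It posts the token to Turnstile's siteverify endpoint and treats anything other than an explicit success as a failure. The function name `verifyTurnstile`, the way the secret is passed in, and the injectable `fetchImpl` parameter are illustrative assumptions, not details from our production code.

```javascript
// Sketch of server-side Turnstile verification for a Pages Function.
// Assumptions: the secret is supplied by the caller (for example from an
// environment variable), and fetchImpl defaults to the global fetch so the
// network call can be stubbed in tests.
const SITEVERIFY_URL =
  "https://challenges.cloudflare.com/turnstile/v0/siteverify";

async function verifyTurnstile(token, secret, request, fetchImpl = fetch) {
  // Reject missing or empty tokens without calling Cloudflare at all.
  if (typeof token !== "string" || token.length === 0) return false;

  const body = new URLSearchParams({ secret, response: token });

  // CF-Connecting-IP is set by Cloudflare; remoteip is an optional parameter.
  const ip = request.headers.get("CF-Connecting-IP");
  if (ip) body.set("remoteip", ip);

  const res = await fetchImpl(SITEVERIFY_URL, { method: "POST", body });
  if (!res.ok) return false;

  const outcome = await res.json();
  return outcome.success === true;
}
```

Calling this first in a handler, and returning a 403 when it resolves to `false`, keeps unverified requests away from everything that follows.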
For sites where Turnstile felt like overkill (a seafood store's simple inquiry form, for example), we used a lighter bot filter instead: a honeypot that silently rejects automated submissions without affecting real visitors. The inbox stays clean.
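A honeypot of the kind the heading mentions can be sketched in a few lines. The form carries an extra input that is hidden from humans with CSS, and any submission that fills it in is dropped. The field name `website` and the `deliver` callback are hypothetical; the real field name and delivery step will differ per site.

```javascript
// Honeypot sketch. "website" is a hypothetical field name: the form includes
// it as an input hidden from humans, so in practice only bots fill it in.
function isHoneypotTripped(fields, honeypotField = "website") {
  const value = fields[honeypotField];
  return typeof value === "string" && value.trim() !== "";
}

// deliver is a hypothetical callback that forwards the inquiry (e.g. email).
// Bots receive the same success response as humans, so they get no signal
// that the submission was discarded.
async function handleInquiry(fields, deliver) {
  if (!isHoneypotTripped(fields)) {
    await deliver(fields);
  }
  return new Response(JSON.stringify({ ok: true }), {
    status: 200,
    headers: { "Content-Type": "application/json" },
  });
}
```

Returning an identical response in both cases is the design choice that makes the rejection silent.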
The modal rendering problem
Several of our websites include an AI live chat feature as a sidebar widget. The chat panel starts hidden and opens on demand. We initially loaded Turnstile with automatic rendering — and it silently broke. The challenge iframe loaded inside the invisible container but couldn't measure its viewport, causing it to fail without any error message. Users would open the chat and see a blank verification box.
The fix was switching to explicit render mode: the Turnstile widget only initializes when the chat panel actually opens and is visible in the DOM. A small detail, but one that cost us debugging time because the failure was completely silent — no console errors, no network failures, just a widget that never appeared.
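The pattern can be sketched as a small wrapper around `turnstile.render()`, which is available when the Turnstile script is loaded in explicit render mode. The helper name `createLazyTurnstile` and the way the container and site key are passed in are assumptions for illustration.

```javascript
// Browser-side sketch. Assumes the Turnstile script was loaded in explicit
// render mode, so the caller can pass in the global turnstile object.
// createLazyTurnstile is a hypothetical helper name.
function createLazyTurnstile(turnstile, container, sitekey, onToken) {
  let widgetId = null;

  // Call the returned function from the handler that opens the chat panel,
  // after the panel has become visible.
  return function onPanelOpened() {
    if (widgetId === null) {
      // Render only once; later opens reuse the existing widget.
      widgetId = turnstile.render(container, { sitekey, callback: onToken });
    }
    return widgetId;
  };
}
```

In a page this might be wired up as `const open = createLazyTurnstile(window.turnstile, "#chat-turnstile", SITE_KEY, handleToken)`, with `open()` called once the panel is shown. The selector and variable names there are placeholders.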
Problem: Static sites can't process anything
A static site serves files. It can't validate a form, send an email, call an API, or store data. The moment a business needs any of that, you need a server — or something that acts like one.
Traditional options — a VPS, a container, a managed backend — all require provisioning, maintenance, and monthly costs. For a restaurant or a retail store that just needs a working contact form, that's disproportionate.
Solution: Pages Functions as a lightweight backend
Cloudflare Pages Functions are serverless functions that deploy alongside your static files. Drop a JavaScript file in a /functions directory, push to Git, and it's a live API endpoint running on the same global edge network as your site. No Docker, no CI/CD pipeline to maintain, no server configuration.
Several of our business sites use Functions — restaurants integrating with reservation platforms, stores handling quote requests, realtors routing leads by listing type, AI products managing chat sessions. Each site gets exactly the backend logic it needs.
A typical function validates input, verifies Turnstile, sanitizes data, and delegates to an external service — an email API, a booking platform, or an AI model. A single JavaScript file, no framework needed.
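The shape of such a function might look like the sketch below. The handler name, the field names, the length caps, and the injected `verifyToken` and `sendEmail` dependencies are all assumptions made for illustration. In a real Pages Functions file (for example `functions/api/contact.js`) the handler would be exported as `onRequestPost`.

```javascript
// Small helper for JSON responses.
function json(data, status = 200) {
  return new Response(JSON.stringify(data), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

// Hypothetical handler: validate, verify, sanitize, delegate.
// deps.verifyToken and deps.sendEmail are placeholders for the Turnstile
// check and the external email API.
async function handleContact(context, deps) {
  let data;
  try {
    data = await context.request.json();
  } catch {
    return json({ error: "Invalid JSON" }, 400);
  }
  if (data === null || typeof data !== "object") {
    return json({ error: "Invalid payload" }, 400);
  }

  // 1. Validate: required fields must be non-empty strings.
  const name = typeof data.name === "string" ? data.name.trim() : "";
  const email = typeof data.email === "string" ? data.email.trim() : "";
  const message = typeof data.message === "string" ? data.message.trim() : "";
  if (!name || !email || !message) {
    return json({ error: "Missing required fields" }, 400);
  }

  // 2. Verify the Turnstile token before any business logic runs.
  if (!(await deps.verifyToken(data.token, context.request))) {
    return json({ error: "Verification failed" }, 403);
  }

  // 3. Sanitize (here: cap lengths) and 4. delegate to the external service.
  await deps.sendEmail({
    name: name.slice(0, 100),
    email: email.slice(0, 254),
    message: message.slice(0, 5000),
  });
  return json({ ok: true });
}
```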
AI-powered lead capture without a server
The AI live chat sidebar we described above does more than just verify visitors — it replaces the traditional contact form entirely. Instead of filling out static fields, visitors interact with an AI assistant that collects their information through natural conversation — name, email, what they're looking for, and any relevant details.
When the AI has gathered enough context, it outputs a structured signal. The frontend detects it and calls a second function that sends two emails in parallel — one to the business owner, one to the customer as confirmation. A few Pages Functions, zero servers, and the entire lead capture flow runs on Cloudflare's edge.
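To make the flow concrete, here is a minimal sketch of the two halves: detecting the signal in the assistant's reply, and sending both emails in parallel. The `<lead>...</lead>` marker format is purely an assumption for this example (the actual signal format is not described here), and `sendEmail` stands in for whatever email API the function calls.

```javascript
// Hypothetical signal format: the assistant wraps a JSON object in
// <lead>...</lead> once it has collected enough information.
const LEAD_PATTERN = /<lead>([\s\S]*?)<\/lead>/;

// Frontend side: returns the parsed lead, or null if no valid signal exists.
function extractLead(reply) {
  const match = LEAD_PATTERN.exec(reply);
  if (!match) return null;
  try {
    const lead = JSON.parse(match[1]);
    return lead && typeof lead.email === "string" ? lead : null;
  } catch {
    return null;
  }
}

// Function side: both emails are started together and awaited as one unit.
// sendEmail is a placeholder for the external email API call.
async function sendLeadEmails(lead, ownerAddress, sendEmail) {
  await Promise.all([
    sendEmail({ to: ownerAddress, subject: "New lead from chat", lead }),
    sendEmail({ to: lead.email, subject: "We received your request", lead }),
  ]);
}
```

Using `Promise.all` means the function waits only as long as the slower of the two sends, and rejects if either one fails.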
Problem: Some backends don't fit in a serverless function
Pages Functions work well for short-lived request-response cycles — form validation, API calls, data logging. But some of our sites run full applications that need persistent connections, long-running processes, and access to multiple databases. That doesn't fit inside a function with a 30-second execution limit.
Solution: Pages Functions as a reverse proxy
The frontend lives on Cloudflare Pages. The heavy backend runs on a separate cloud platform. A single Pages Function bridges them: it catches every /api/* request, authenticates the connection, and forwards it to the backend server. Cloudflare handles SSL, DDoS protection, and caching for static assets. The cloud platform handles the heavy processing.
Browser
  |
  +-- Static assets --> Cloudflare Pages CDN (edge-cached)
  |
  +-- /api/* requests --> Pages Function (reverse proxy)
                            |
                            +-- Attach secret header
                            +-- Forward to backend server
                            +-- Stream response back
This pattern — Cloudflare for the edge, a separate platform for heavy compute — lets us use the best tool for each job without locking the entire stack into one provider. The frontend gets global CDN delivery and Cloudflare's security layer. The backend gets the runtime flexibility it needs.
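A proxy function along these lines can be quite small. The sketch below assumes two environment variables, `BACKEND_ORIGIN` and `PROXY_SECRET`, and a header named `X-Proxy-Secret`; all three names are hypothetical. In a Pages project it would live in a catch-all file such as `functions/api/[[path]].js` and be exported as `onRequest`.

```javascript
// Reverse-proxy sketch for a Pages Function.
// BACKEND_ORIGIN, PROXY_SECRET and X-Proxy-Secret are hypothetical names.
// fetchImpl defaults to the global fetch and can be stubbed in tests.
async function proxyToBackend(context, fetchImpl = fetch) {
  const { request, env } = context;

  // Keep the original path and query string, swap in the backend origin.
  const incoming = new URL(request.url);
  const target = new URL(incoming.pathname + incoming.search, env.BACKEND_ORIGIN);

  // Copy the incoming headers and attach the shared secret so the backend
  // can reject traffic that did not come through the proxy.
  const headers = new Headers(request.headers);
  headers.set("X-Proxy-Secret", env.PROXY_SECRET);

  const init = { method: request.method, headers, redirect: "manual" };
  if (request.method !== "GET" && request.method !== "HEAD") {
    // Pass the body through as a stream rather than buffering it.
    init.body = request.body;
  }

  // Returning the backend response directly streams it back to the browser.
  return fetchImpl(target.toString(), init);
}
```

The backend then only needs to compare the header against its own copy of the secret and refuse anything that does not match.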
Problem: We had no idea who was visiting
After launching several client sites, we realized we were operating blind. Google Analytics tells you page views and bounce rates, but nothing about the individual visitors interacting with your site. When someone submitted a contact form or browsed a service page, we had no context — where they were, how they found us, or whether they'd visited before. Every lead got the same generic follow-up.
Solution: Cloudflare's request.cf object + D1
Every HTTP request through Cloudflare carries metadata that most platforms discard. Cloudflare exposes it via the request.cf object — geo data derived from the connecting IP, populated at the edge before your code runs. No third-party lookups, no API calls, no added latency.
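Reading that metadata is plain property access. The sketch below pulls a handful of `request.cf` properties and request headers into a flat record; the function name and the output key names are illustrative, and the `cf` object is treated as optional because it may be absent outside Cloudflare's network (for example in local tests).

```javascript
// Sketch: flatten request.cf and a few headers into one record.
// readVisitorMeta and the output key names are hypothetical.
function readVisitorMeta(request) {
  // cf may be missing when the code runs outside Cloudflare's edge.
  const cf = request.cf || {};
  return {
    country: cf.country ?? null,
    region: cf.region ?? null,
    city: cf.city ?? null,
    postal_code: cf.postalCode ?? null,
    timezone: cf.timezone ?? null,
    protocol: cf.httpProtocol ?? null,
    tls: cf.tlsVersion ?? null,
    lang: request.headers.get("Accept-Language"),
    referer: request.headers.get("Referer"),
    ua: request.headers.get("User-Agent"),
  };
}
```

Normalizing missing values to `null` keeps the record safe to bind into a SQL statement later.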
We started logging aggregated visitor interaction data across our business sites into a Cloudflare D1 database:
| Field | Source | What it told us |
|---|---|---|
| country, region, city | request.cf | Shows where visitors are actually coming from: not just the country, but the specific city and province |
| postal_code | request.cf | Regional interest patterns: shows which areas generate the most engagement for local businesses |
| timezone | request.cf | Helps time follow-ups appropriately: a visitor in a different time zone should get a response during their business hours, not ours |
| network type | request.cf | Broad network category: helps distinguish organic traffic patterns from automated sources |
| lang | Accept-Language header | Significant Japanese-language traffic told us to prioritize Japanese content for certain clients |
| referer | Referer header | Which channels actually drive engaged visitors: Google Search, LinkedIn, or direct? |
| ua, protocol, tls | Headers / request.cf | Device and connection profile: most visitors were mobile, which changed our design priorities |
The logging runs asynchronously via Cloudflare's waitUntil() API — the D1 insert happens after the response is already sent, so it never adds latency to the user's experience. If the insert fails, the error is logged but the request continues normally.
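In code, the pattern is roughly the following. The binding name `DB`, the `visits` table, and its columns are hypothetical; the `prepare().bind().run()` chain and `context.waitUntil()` are the D1 and Pages Functions calls involved.

```javascript
// Fire-and-forget visit logging sketch.
// DB (the D1 binding) and the visits table/columns are hypothetical names.
function logVisit(context, visit) {
  const insert = context.env.DB
    .prepare(
      "INSERT INTO visits (country, city, referer, created_at) VALUES (?, ?, ?, ?)"
    )
    // Missing values are bound as null rather than undefined.
    .bind(
      visit.country ?? null,
      visit.city ?? null,
      visit.referer ?? null,
      new Date().toISOString()
    )
    .run()
    // A failed insert is logged and swallowed so it never affects the request.
    .catch((err) => console.error("visit log failed:", err));

  // Keeps the function alive until the insert settles, without the response
  // having to wait for it.
  context.waitUntil(insert);
}
```

The handler calls `logVisit(context, visit)` and returns its response immediately, without awaiting the insert.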
What changed after we had the data
The data shifted several assumptions:
- SEO targeting. Geo data revealed that traffic wasn't coming from where we expected. We adjusted keyword strategies across client sites to match where interest actually existed, rather than where we assumed it would.
- Language priorities. Accept-Language data for one client showed enough Japanese-language traffic to justify adding Japanese content, a decision based on real data instead of guessing.
- Lead prioritization. Aggregated visit patterns help distinguish engaged visitors from casual browsers, without asking anyone to fill out qualification forms.
- Referer analysis. For one client, Instagram referrals had a higher conversion rate than Google Search. That changed where we recommended they spend their marketing budget.
D1 as a zero-cost analytics backend
Cloudflare D1 is a SQLite database at the edge. We store visitor data there and query it as needed — optimized for the lookups we actually run. For sites doing hundreds of interactions per week, we're nowhere near the limits. No separate analytics platform, no data pipeline, no monthly SaaS fee.
All visitor data is aggregated and anonymized. We don't use third-party trackers or sell data. Each client site includes a privacy policy that discloses what is collected and how it's used, consistent with Canadian privacy regulations.
Problem: Forms and APIs were exposed to injection attacks
When you accept user input and put it into an email template, every field is a potential injection vector. An attacker who submits <script>alert('xss')</script> as their name could execute JavaScript in your email client. A newline character in an email subject can inject additional headers. An oversized payload can exhaust your function's memory.
These aren't theoretical risks. They're standard attack vectors that hit any form on the public internet.
Solution: Defense at every layer
We built validation into every Pages Function independently — no shared middleware, no assumption that the frontend already checked:
- Required fields checked for presence and type
- Input lengths capped per field
- Email format validated server-side, not just in the browser
- All user input HTML-escaped before inclusion in email templates
- Email subject lines stripped of \r\n to prevent header injection
- CORS origin checked against an explicit allowlist; unauthorized origins get a 403
- Request body size capped before parsing
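Several of the checks in this list reduce to small pure functions. The sketch below shows one possible version of each; the function names, the allowlisted origin, the email pattern, and the size limit are illustrative assumptions rather than the exact rules used on client sites.

```javascript
// Hypothetical allowlist; each site would list its own origins.
const ALLOWED_ORIGINS = new Set(["https://example.com"]);

// Escape the five characters that matter when placing text into HTML.
// The ampersand is replaced first so existing entities are not double-escaped
// incorrectly.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Collapse CR/LF runs so a subject line cannot smuggle in extra headers.
function stripLineBreaks(value) {
  return String(value).replace(/[\r\n]+/g, " ").trim();
}

// Deliberately simple format check; not a full RFC 5322 validator.
function isValidEmail(value) {
  return (
    typeof value === "string" &&
    value.length <= 254 &&
    /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)
  );
}

// Callers respond with 403 when this returns false.
function isAllowedOrigin(request) {
  const origin = request.headers.get("Origin");
  return origin !== null && ALLOWED_ORIGINS.has(origin);
}

// Cheap pre-parse check based on the declared Content-Length.
// A missing or malformed header is treated as too large.
function isBodyTooLarge(request, maxBytes = 10000) {
  const raw = request.headers.get("Content-Length");
  if (raw === null) return true;
  const length = Number(raw);
  return !Number.isFinite(length) || length > maxBytes;
}
```

For example, `escapeHtml("<script>alert('xss')</script>")` returns `&lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;`, which renders as inert text in an email template.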
Beyond the function code, every site ships with a _headers file that Cloudflare Pages applies to all responses:
- Content-Security-Policy allowlists exactly which domains can serve scripts, styles, and frames. Each site's CSP is tailored: one site permits challenges.cloudflare.com for Turnstile; another permits maps.googleapis.com for embedded maps. Nothing else.
- Strict-Transport-Security forces HTTPS for one year with HSTS preload. Once a browser has seen this header, it will not attempt an insecure connection to the site until that period expires.
- X-Frame-Options: DENY — blocks clickjacking by preventing the site from being embedded in iframes.
- Permissions-Policy — explicitly disables camera, microphone, and geolocation. A restaurant website has no business accessing your camera.
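For illustration, a `_headers` file covering the four headers above might look like the fragment below. The CSP shown is an assumed example for a site that only needs Turnstile; the exact directives are site-specific and would differ for each client.

```text
/*
  Content-Security-Policy: default-src 'self'; script-src 'self' https://challenges.cloudflare.com; frame-src https://challenges.cloudflare.com
  Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
  X-Frame-Options: DENY
  Permissions-Policy: camera=(), microphone=(), geolocation=()
```

The `/*` line applies the indented headers to every path on the site, and `max-age=31536000` is the one-year HSTS period expressed in seconds.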
What Cloudflare gives you for SEO — without extra work
We noticed that client sites deployed on Cloudflare consistently scored higher on Google's technical SEO signals than sites we'd previously deployed on other platforms. The infrastructure handles several ranking factors automatically:
- TTFB under 50ms globally. Cloudflare's CDN serves static files from 300+ edge locations — we don't manage this infrastructure, it's built into the platform. Google's Core Web Vitals benefit directly from this edge delivery.
- Automatic HTTPS with certificate renewal. Cloudflare handles certificate provisioning and renewal. HTTPS has been a Google ranking signal since 2014. HSTS preload means browsers never even attempt an insecure connection.
- HTTP/2 and HTTP/3 by default. Multiplexed connections, header compression, and 0-RTT handshakes — all enabled by Cloudflare without configuration on our part.
- No single point of failure. Cloudflare's distributed CDN keeps sites available. Sites that are always up get crawled more frequently.
On top of the infrastructure, we add the standard SEO tooling manually: XML sitemaps with accurate lastmod dates, robots.txt that excludes /api/ routes from crawling, JSON-LD structured data for articles and breadcrumbs, Open Graph and Twitter Card meta tags, and canonical URLs.
The visitor data from D1 closes the loop. Instead of guessing which keywords to target or which languages to support, we can see where actual visitors are, what language they prefer, and which referral channels drive real engagement — then optimize content to match.
Each problem we hit — spam, missing visitor data, injection vulnerabilities, backends that don't fit serverless — had a solution already inside Cloudflare's platform. The result is a stack where all of our business websites run on the same infrastructure, each using exactly the layers it needs.