JSON to PDF
I was recently tasked with something I wasn’t exactly thrilled about — generating a PDF from a variable-shaped JSON object.
At first, it felt like a chore. But once I started looking at the data structure, patterns emerged. And those patterns led to a surprisingly clean, flexible rendering system built on config-driven rules, type-based defaults, and optional overrides.
This write-up documents how I got from a basic loop to something modular and extensible — one small step at a time.
Part 1: Make it work
function convertToFormattedHTML(jsonData) {
let htmlStr = ``;
for (let [key, value] of Object.entries(jsonData)) {
console.log(typeof value, value);
if (typeof value === "string") {
htmlStr += totextSection([key, value]);
continue;
}
if (Array.isArray(value)) {
htmlStr += arrayToTableSection([key, value]);
continue;
}
if (typeof value === "object") {
//to be implemented later
}
}
console.log(htmlStr);
return htmlStr;
}
Even in this first stage, a "config" structure starts to emerge. For array sections we will usually want to show only specific columns rather than one column for every field in the object, so I added a fieldMap argument that maps each section key to the list of fields/keys we want to display.
function convertToFormattedHTML(jsonData, fieldMap = {}) {
let htmlStr = ``;
for (let [key, value] of Object.entries(jsonData)) {
console.log(typeof value, value);
if (typeof value === "string") {
htmlStr += totextSection([key, value]);
continue;
}
if (Array.isArray(value)) {
const fields = fieldMap[key]; // <-- get custom field list based on the specific table
htmlStr += arrayToTableSection([key, value], fields);
continue;
}
if (typeof value === "object") {
//to be implemented later
}
}
console.log(htmlStr);
return htmlStr;
}
...and here are the render functions
function totextSection([sectionTitle, text]) {
return `
<h5>${sectionTitle}</h5>
<div>${text}</div>
`;
}
function arrayToTableSection([sectionTitle, items], fields = null) {
if (!items || items.length === 0) return "";
// If no field list, use all keys
const headersList = fields || Object.keys(items[0]);
const headers = headersList.map((k) => `<th>${k}</th>`).join("");
const rows = items
.map(
(item) => `
<tr>
${headersList.map((k) => `<td>${item[k] ?? ""}</td>`).join("")}
</tr>
`
)
.join("");
return `
<h5>${sectionTitle}</h5>
<table>
<thead><tr>${headers}</tr></thead>
<tbody>${rows}</tbody>
</table>
`;
}
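One thing to keep in mind: the render functions above interpolate keys and values straight into the markup, so a value containing `<` or `&` could break the layout. Below is a minimal sketch of an escaping helper. `escapeHtml` and `toSafeTextSection` are hypothetical names that are not part of the original code; they only illustrate one way to harden the text renderer.

```javascript
// Hypothetical helper (not part of the original code): escape the five
// characters that have special meaning in HTML text and attribute values.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(value) {
  // null/undefined become an empty string, everything else is stringified.
  return String(value ?? "").replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Same shape as totextSection(), but with an escaped title and text.
function toSafeTextSection([sectionTitle, text]) {
  return `
    <h5>${escapeHtml(sectionTitle)}</h5>
    <div>${escapeHtml(text)}</div>
  `;
}
```

The same helper could be applied to the `<th>` and `<td>` values in arrayToTableSection() if the data is not trusted.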
🧩 Part 2: Add a Config Object for Section Metadata
At this point, we’ve added variable column support using the fieldMap argument, a quick way to control which fields show up for array sections.
But then a familiar thought crept in:
“What if I also want to customize the title shown for each section?”
Maybe the JSON key is ev_loc_info, but we want to display "Event Information" in the PDF. Or maybe we want to control the order sections appear in, or the column layout for just a few of them.
Rather than handling all of that inline, this is a good time to introduce a config object.
We call it pdfRenderConfig, and it lets us decouple rendering behavior from data structure.
We don’t need to define entries for every key — just the ones where we want to override defaults.
So far, the config supports:
- A custom title (shown in the PDF)
- A numeric order (to control display order)
- A list of fields (for array-based sections)
Here's an example:
const pdfRenderConfig = {
ev_loc_info: {
title: "Event Information",
order: 1,
},
participants: {
title: "List of Participants",
order: 2,
fields: ["name", "role", "status"],
},
};
✨ Updated Render Logic
function convertToFormattedHTML(jsonData, config = {}) {
const sorted = Object.entries(jsonData).sort(([a], [b]) => {
return (config[a]?.order ?? 999) - (config[b]?.order ?? 999);
});
let htmlStr = "";
for (let [key, value] of sorted) {
const cfg = config[key] || {};
const title = cfg.title || key;
const fields = cfg.fields || null;
if (typeof value === "string") {
htmlStr += totextSection([title, value]);
} else if (Array.isArray(value)) {
htmlStr += arrayToTableSection([title, value], fields);
} else if (typeof value === "object" && value !== null) {
htmlStr += totextSection([title, JSON.stringify(value)]);
}
}
return htmlStr;
}
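To see what the sort does with keys that have no config entry, here is a small sketch that pulls the sorting step out into its own function. `sortSections` is a name introduced only for this illustration, and the sample data and config are made up.

```javascript
// Extracted from convertToFormattedHTML() for illustration only.
function sortSections(jsonData, config = {}) {
  return Object.entries(jsonData).sort(([a], [b]) => {
    // Keys without an explicit order fall back to 999 and end up last.
    return (config[a]?.order ?? 999) - (config[b]?.order ?? 999);
  });
}

// Sample data and config are invented for this example.
const sampleData = {
  notes: "Bring your own laptop.",
  participants: [{ name: "Ada", role: "Speaker", status: "confirmed" }],
  ev_loc_info: "Main hall, 2nd floor",
  summary: "Quarterly planning session.",
};

const sampleConfig = {
  ev_loc_info: { title: "Event Information", order: 1 },
  participants: { title: "List of Participants", order: 2 },
};

const orderedKeys = sortSections(sampleData, sampleConfig).map(([key]) => key);
console.log(orderedKeys);
```

Configured sections come first in their numeric order. Unconfigured ones all share the fallback value, and because Array.prototype.sort is stable in modern engines they keep the order they had in the JSON (here: notes, then summary).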
🧱 Part 3: Override Rendering per Object
So far, we’ve built a solid base:
- convertToFormattedHTML() loops through each section
- We infer rendering based on data type (string, array, object)
- The config gives us control over section titles, order, and table fields
But… what happens when a section just doesn't look right?
Maybe the formatting needs to be tighter.
Maybe you want to inject custom markup or inline styles.
Maybe it's just plain different.
💡 The Solution: renderFn
We allow any section to define its own renderFn inside the config:
rawJsonDump: {
title: "Debug Info",
order: 99,
renderFn: (val, title) => `
<h5>${title}</h5>
<pre style="font-size: 10px; white-space: pre-wrap;">
${JSON.stringify(val, null, 2)}
</pre>
`,
},
⚙️ Updated Render Loop
In our loop, we check for and use renderFn if it exists:
for (let [key, value] of sorted) {
const cfg = config[key] || {};
const title = cfg.title || key;
const fields = cfg.fields || null;
const renderFn = cfg.renderFn;
if (typeof renderFn === "function") {
htmlStr += renderFn(value, title);
continue;
}
// fallback to type-based rendering
...
}
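As one possible use of this hook: plain objects currently fall through to JSON.stringify(), which is rarely what you want in a finished PDF. The sketch below shows a renderFn that lays a flat object out as a two-column table instead. `renderKeyValueTable` is a hypothetical renderer, and the sample value is made up; only the (value, title) signature comes from the loop above.

```javascript
// Hypothetical renderFn: renders a flat object as a two-column table.
// The (val, title) signature matches the call made by the render loop.
function renderKeyValueTable(val, title) {
  const rows = Object.entries(val ?? {})
    .map(([k, v]) => `<tr><th>${k}</th><td>${v ?? ""}</td></tr>`)
    .join("");
  return `
    <h5>${title}</h5>
    <table><tbody>${rows}</tbody></table>
  `;
}

// Example config entry; the shape of ev_loc_info is assumed for illustration.
const exampleConfig = {
  ev_loc_info: {
    title: "Event Information",
    order: 1,
    renderFn: renderKeyValueTable,
  },
};

// Calling the renderer the same way the loop would.
const html = exampleConfig.ev_loc_info.renderFn(
  { venue: "Main hall", city: "Lisbon" },
  exampleConfig.ev_loc_info.title
);
console.log(html);
```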
✅ Benefits
- We still keep default behavior for the vast majority of sections
- We avoid cramming weird layout logic into our type-based functions
- We stay flexible without sacrificing structure
🔁 Reusable Render Functions (Addendum to Part 3)
OK, so now we have renderFns defined inside our config:
const pdfConfig = {
rawJsonDump: {
title: "Raw JSON Dump",
order: 99,
renderFn: (val, title) => `
<h5>${title}</h5>
<pre style="font-size: 10px; white-space: pre-wrap;">
${JSON.stringify(val, null, 2)}
</pre>
`,
},
};
This works great for one-off custom layouts. But if you find yourself repeating the same renderer logic across multiple sections — or if the function starts getting long — you can always pull it out:
function renderJsonAsPre(val, title) {
return `
<h5>${title}</h5>
<pre style="font-size: 10px; white-space: pre-wrap;">
${JSON.stringify(val, null, 2)}
</pre>
`;
}
Then reference it:
const pdfConfig = {
rawJsonDump: {
title: "Raw JSON Dump",
order: 99,
renderFn: renderJsonAsPre,
},
};
If you're using a bunch of these, you could eventually move them to a separate file (like renderers.js) and import them, but there's no rush.
This pattern is just a nice reminder:
📦 Config doesn’t have to be static — it can include logic.
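Taking that idea one step further, a renderer can be produced by a small factory, so that two sections share the same layout with different settings. This is only a sketch: `makePreRenderer`, its `fontSize` option, and the `auditTrail` section are hypothetical and not part of the original config.

```javascript
// Hypothetical factory: returns a renderFn with the given style baked in.
function makePreRenderer({ fontSize = "10px" } = {}) {
  return (val, title) => `
    <h5>${title}</h5>
    <pre style="font-size: ${fontSize}; white-space: pre-wrap;">
${JSON.stringify(val, null, 2)}
    </pre>
  `;
}

const pdfConfig = {
  rawJsonDump: {
    title: "Raw JSON Dump",
    order: 99,
    renderFn: makePreRenderer(), // default 10px
  },
  // Made-up second section that reuses the same layout with smaller text.
  auditTrail: {
    title: "Audit Trail",
    order: 100,
    renderFn: makePreRenderer({ fontSize: "8px" }),
  },
};
```

The render loop does not need to change: it still only checks whether renderFn is a function.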
Part 4: Handling The Legacy Format Known as Sheets of Paper
Because yes, the humble PDF is really just our best effort at making pixels behave like dead trees.
🧱 Part 4A: Prevent Sections from Being Cut in Half (Page Break Wrapping)
At this point, our PDF generator renders each section with proper titles, ordering, and optional custom renderers.
But there's one last annoyance: PDF page breaks don't always care about your content layout.
You might end up with a heading on one page and its content on the next, or worse, a cut straight through the middle of a line of text 😬
🎯 The Goal
We want each section to be treated as a single printable unit, so that:
- Page breaks can happen between sections
- But not inside a section (unless it's really long — more on that in a sec)
✅ The Fix: Wrap Each Section in a Container
In convertToFormattedHTML(), we now wrap each section like this:
htmlStr += `<div class="pdf-section">${sectionHtml}</div>`;
So your loop becomes:
for (let [key, value] of sorted) {
const cfg = config[key] || {};
const title = cfg.title || key;
const fields = cfg.fields || null;
const renderFn = cfg.renderFn;
let sectionHtml = "";
if (typeof renderFn === "function") {
sectionHtml = renderFn(value, title);
} else if (typeof value === "string") {
sectionHtml = totextSection([title, value]);
} else if (Array.isArray(value)) {
sectionHtml = arrayToTableSection([title, value], fields);
} else if (typeof value === "object" && value !== null) {
sectionHtml = totextSection([title, JSON.stringify(value, null, 2)]);
}
// Skip empty sections (e.g. empty arrays) so no blank wrapper is emitted
if (sectionHtml) htmlStr += `<div class="pdf-section">${sectionHtml}</div>`;
}
🖼️ Add Some CSS
.pdf-section {
page-break-inside: avoid;
break-inside: avoid;
margin-bottom: 1.5rem;
}
This tells the PDF renderer (browser or html2pdf) to try really hard to keep the section together.
📝 Note: if a single section is taller than the full page, it may still get cut. We will cover that in the next part!
🧼 Summary
Wrapping each section in a div.pdf-section:
- Keeps headings and content together
- Avoids ugly mid-element page breaks
- Makes future styling/spacing easier too
🧨 Part 4B: When a Section is Just Too Damn Big
(aka "Handling Oversized Sections That Spill Off the Page")
Even with page-break-inside: avoid, there’s one class of problems we still have to deal with:
What if a single section — like a long table or big blob of text — is too tall to fit on one page?
That’s where your nicely-wrapped .pdf-section becomes a double-edged sword.
🔥 The Problem
- page-break-inside: avoid tells the PDF renderer: “keep this whole thing together”
- But if the section physically can’t fit, it may get:
  - Cut off mid-word (bad)
  - Overflow the page (worse)
  - Or cause rendering glitches (depends on the tool)
✅ The Strategy: Detect & Relax the Break Rule
We can measure the rendered height of each .pdf-section and remove the break-inside: avoid rule only for sections that are too tall.
🛠️ Implementation
1. Wrap all sections with .pdf-section (you already do this)
2. Add a marker class to allow page breaks:
.pdf-section.allow-breaks {
break-inside: auto;
page-break-inside: auto;
}
3. Use JS to measure and classify
Once the DOM is rendered but before generating the PDF:
const MAX_PAGE_HEIGHT = 1000; // or whatever your PDF page height is in px
document.querySelectorAll('.pdf-section').forEach((section) => {
const height = section.offsetHeight;
if (height > MAX_PAGE_HEIGHT) {
section.classList.add('allow-breaks');
}
});
We can fine-tune MAX_PAGE_HEIGHT depending on:
- Page size (A4, Letter, etc.)
- Margins
- Font sizes, etc.
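Rather than guessing the number, the limit can be derived from the paper size and margins. The sketch below assumes the renderer lays the page out at CSS resolution (1in = 96px = 25.4mm); if your tool scales the content to fit the page, the effective limit will differ. `usablePageHeightPx` and `markOversizedSections` are hypothetical helper names, and the 10 mm margins are just an example value.

```javascript
// CSS defines 1in = 96px and 1in = 25.4mm, so this converts mm to CSS px.
const PX_PER_MM = 96 / 25.4;

// Hypothetical helper: usable page height in CSS px for a paper height and
// top/bottom margins given in millimetres.
function usablePageHeightPx({ pageHeightMm, marginTopMm = 0, marginBottomMm = 0 }) {
  return Math.floor((pageHeightMm - marginTopMm - marginBottomMm) * PX_PER_MM);
}

// Same classification as the measuring snippet, but taking the sections and
// the limit as arguments so it can be reused and tested without a real DOM.
function markOversizedSections(sections, maxHeightPx) {
  sections.forEach((section) => {
    if (section.offsetHeight > maxHeightPx) {
      section.classList.add("allow-breaks");
    }
  });
}

// A4 is 210 x 297 mm; the margins here are an assumption for the example.
const maxHeight = usablePageHeightPx({
  pageHeightMm: 297,
  marginTopMm: 10,
  marginBottomMm: 10,
});

// In the browser, once the DOM is rendered:
// markOversizedSections(document.querySelectorAll(".pdf-section"), maxHeight);
```

With these numbers the limit comes out at 1046px, reasonably close to the 1000 used as a placeholder.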
🧠 Optional Upgrade: Use getBoundingClientRect()
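offsetHeight is rounded to an integer and ignores CSS transforms. If you need fractional precision, or the content is scaled with a transform, measure with getBoundingClientRect() instead: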
const { height } = section.getBoundingClientRect();
🧼 Result
We get the best of both worlds:
- Small sections stay glued together
- Big ones are allowed to break safely across multiple pages
- We avoid the horror of mid-word or mid-line splits