Building for large systems and long-running background jobs.Credit: Ilias Chebbi on Unsplash Months ago, I assumed the role that required building infrastBuilding for large systems and long-running background jobs.Credit: Ilias Chebbi on Unsplash Months ago, I assumed the role that required building infrast

Building Spotify for Sermons.

2025/12/11 21:15

Building for large systems and long-running background jobs.

Credit: Ilias Chebbi on Unsplash

Months ago, I assumed the role that required building infrastructure for media(audio) streaming. But beyond serving audio as streamable chunks, there were long-running media processing jobs and an extensive RAG pipeline that catered to transcription, transcoding, embedding, and sequential media updates. Building an MVP with a production mindset had us reiterate till we achieved a seamless system. Our approach has been one where we integrated features and the underlying stack of priorities.

Of Primary concern:

Over the course of building, each iteration came as a response to immediate and often “encompassing” need. Initial concern was queuing jobs, which readily sufficed with Redis; we simply fired and forgot. Bull MQ in the NEST JS framework gave us an even better control over retries, backlogs, and the dead-letter queue. Locally and with a few payloads in production, we got the media flow right. We were soon burdened by the weight of Observability:
Logs → Record of jobs (requests, responses, errors).
Metrics → How much / how often these jobs run, fail, complete, etc.
Traces → The path a job took across services (functions/methods called within the flow path).

You can solve some of these by designing APIs and building a custom dashboard to plug them into, but the problem of scalability will suffice. And in fact, we did design the APIs.

Building for Observability

The challenge of managing complex, long-running backend workflows, where failures must be recoverable, and state must be durable, Inngest became our architectural salvation. It fundamentally reframed our approach: each long-running background job becomes a background function, triggered by a specific event.

For instance, an Transcription.request event will trigger a TranscribeAudio function. This function might contain step-runs for: fetch_audio_metadata, deepgram_transcribe, parse_save_trasncription, and notify_user.

Deconstructing the Workflow: The Inngest Function and Step-runs

The core durability primitive is the step-runs. A background function is internally broken down into these step-runs, each containing a minimal, atomic block of logic.

  • Atomic Logic: A function executes your business logic step by step. If a step fails, the state of the entire run is preserved, and the run can be retried. This restarts the function from the beginning. Individual steps or step-runs cannot be retried in isolation.
  • Response Serialization: A step-run is defined by its response. This response is automatically serialized, which is essential for preserving complex or strongly-typed data structures across execution boundaries. Subsequent step-runs can reliably parse this serialized response, or logic can be merged into a single step for efficiency.
  • Decoupling and Scheduling: Within a function, we can conditionally queue or schedule new, dependent events, enabling complex fan-out/fan-in patterns and long-term scheduling up to a year. Errors and successes at any point can be caught, branched, and handled further down the workflow.

Inngest function abstract:

import { inngest } from 'inngest-client';

export const createMyFunction = (dependencies) => {
return inngest.createFunction(
{
id: 'my-function',
name: 'My Example Function',
retries: 3, // retry the entire run on failure
concurrency: { limit: 5 },
onFailure: async ({ event, error, step }) => {
// handle errors here
await step.run('handle-error', async () => {
console.error('Error processing event:', error);
});
},
},
{ event: 'my/event.triggered' },
async ({ event, step }) => {
const { payload } = event.data;

// Step 1: Define first step
const step1Result = await step.run('step-1', async () => {
// logic for step 1
return `Processed ${payload}`;
});

// Step 2: Define second step
const step2Result = await step.run('step-2', async () => {
// logic for step 2
return step1Result + ' -> step 2';
});

// Step N: Continue as needed
await step.run('final-step', async () => {
// finalization logic
console.log('Finished processing:', step2Result);
});

return { success: true };
},
);
};

The event-driven model of Inngest provides granular insight into every workflow execution:

  • Comprehensive Event Tracing: Every queued function execution is logged against its originating event. This provides a clear, high-level trail of all activities related to a single user action.
  • Detailed Run Insights: For each function execution (both successes and failures), Inngest provides detailed logs via its ack (acknowledge) and nack (negative acknowledgment) reporting. These logs include error stack traces, full request payloads, and the serialized response payloads for every individual step-run.
  • Operational Metrics: Beyond logs, we gained critical metrics on function health, including success rates, failure rates, and retry count, allowing us to continuously monitor the reliability and latency of our distributed workflows.

Building for Resilience

The caveat to relying on pure event processing is that while Inngest efficiently queues function executions, the events themselves are not internally queued in a traditional messaging broker sense. This absence of an explicit event queue can be problematic in high-traffic scenarios due to potential race conditions or dropped events if the ingestion endpoint is overwhelmed.

To address this and enforce strict event durability, we implemented a dedicated queuing system as a buffer.

AWS Simple Queue System (SQS) was the system of choice (though any robust queuing system is doable), given our existing infrastructure on AWS. We architected a two-queue system: a Main Queue and a Dead Letter Queue (DLQ).

We established an Elastic Beanstalk (EB) Worker Environment specifically configured to consume messages directly from the Main Queue. If a message in the Main Queue fails to be processed by the EB Worker a set number of times, the Main Queue automatically moves the failed message to the dedicated DLQ. This ensures no event is lost permanently if it fails to trigger or be picked up by Inngest. This worker environment differs from a standard EB web server environment, as its sole responsibility is message consumption and processing (in this case, forwarding the consumed message to the Inngest API endpoint).

UNDERSTANDING LIMITS AND SPECIFICATIONS

An understated and rather pertinent part of building enterprise-scale infrastructure is that it consumes resources, and they are long-running. Microservices architecture provides scalability per service. Storage, RAM, and timeouts of resources will come into play. Our specification for AWS instance type, for example, moved quickly from t3.micro to t3.small, and is now pegged at t3.medium. For long-running, CPU-intensive background jobs, horizontal scaling with tiny instances fails because the bottleneck is the time it takes to process a single job, not the volume of new jobs entering the queue.

Jobs or functions like transcoding, embedding are typically CPU-bound and Memory-bound. CPU-bound because they require sustained, intense CPU usage, and Memory-Bound because they often require substantial RAM to load large models or handle large files or payloads efficiently.

Ultimately, this augmented architecture, placing the durability of SQS and the controlled execution of an EB Worker environment directly upstream of the Inngest API, provided essential resiliency. We achieved strict event ownership, eliminated race conditions during traffic spikes, and gained a non-volatile dead letter mechanism. We leveraged Inngest for its workflow orchestration and debugging capabilities, while relying on AWS primitives for maximum message throughput and durability. The resulting system is not only scalable but highly auditable, successfully translating complex, long-running backend jobs into secure, observable, and failure-tolerant micro-steps.


Building Spotify for Sermons. was originally published in Coinmonks on Medium, where people are continuing the conversation by highlighting and responding to this story.

Sorumluluk Reddi: Bu sitede yeniden yayınlanan makaleler, halka açık platformlardan alınmıştır ve yalnızca bilgilendirme amaçlıdır. MEXC'nin görüşlerini yansıtmayabilir. Tüm hakları telif sahiplerine aittir. Herhangi bir içeriğin üçüncü taraf haklarını ihlal ettiğini düşünüyorsanız, kaldırılması için lütfen service@support.mexc.com ile iletişime geçin. MEXC, içeriğin doğruluğu, eksiksizliği veya güncelliği konusunda hiçbir garanti vermez ve sağlanan bilgilere dayalı olarak alınan herhangi bir eylemden sorumlu değildir. İçerik, finansal, yasal veya diğer profesyonel tavsiye niteliğinde değildir ve MEXC tarafından bir tavsiye veya onay olarak değerlendirilmemelidir.

Ayrıca Şunları da Beğenebilirsiniz

What We Know (and Don’t) About Modern Code Reviews

What We Know (and Don’t) About Modern Code Reviews

This article traces the evolution of modern code review from formal inspections to tool-driven workflows, maps key research themes, and highlights a critical gap
Paylaş
Hackernoon2025/12/17 17:00
X claims the right to share your private AI chats with everyone under new rules – no opt out

X claims the right to share your private AI chats with everyone under new rules – no opt out

X says its Terms of Service will change Jan. 15, 2026, expanding how the platform defines user “Content” and adding contract language tied to the operation and
Paylaş
CryptoSlate2025/12/17 19:24
Michael Saylor Pushes Digital Capital Narrative At Bitcoin Treasuries Unconference

Michael Saylor Pushes Digital Capital Narrative At Bitcoin Treasuries Unconference

The post Michael Saylor Pushes Digital Capital Narrative At Bitcoin Treasuries Unconference appeared on BitcoinEthereumNews.com. The suitcoiners are in town.  From a low-key, circular podium in the middle of a lavish New York City event hall, Strategy executive chairman Michael Saylor took the mic and opened the Bitcoin Treasuries Unconference event. He joked awkwardly about the orange ties, dresses, caps and other merch to the (mostly male) audience of who’s-who in the bitcoin treasury company world.  Once he got onto the regular beat, it was much of the same: calm and relaxed, speaking freely and with confidence, his keynote was heavy on the metaphors and larger historical stories. Treasury companies are like Rockefeller’s Standard Oil in its early years, Michael Saylor said: We’ve just discovered crude oil and now we’re making sense of the myriad ways in which we can use it — the automobile revolution and jet fuel is still well ahead of us.  Established, trillion-dollar companies not using AI because of “security concerns” make them slow and stupid — just like companies and individuals rejecting digital assets now make them poor and weak.  “I’d like to think that we understood our business five years ago; we didn’t.”  We went from a defensive investment into bitcoin, Saylor said, to opportunistic, to strategic, and finally transformational; “only then did we realize that we were different.” Michael Saylor: You Come Into My Financial History House?! Jokes aside, Michael Saylor is very welcome to the warm waters of our financial past. He acquitted himself honorably by invoking the British Consol — though mispronouncing it, and misdating it to the 1780s; Pelham’s consolidation of debts happened in the 1750s and perpetual government debt existed well before then — and comparing it to the gold standard and the future of bitcoin. He’s right that Strategy’s STRC product in many ways imitates the consols; irredeemable, perpetual debt, issued at par, with…
Paylaş
BitcoinEthereumNews2025/09/18 02:12