fix: Implement distributed locking in Redis for cron jobs

Eric Gullickson
2026-01-01 11:02:54 -06:00
parent ffd8ecd1d0
commit d8ea0c7297
6 changed files with 271 additions and 6 deletions


@@ -0,0 +1,92 @@
# Scheduler Module
Centralized cron job scheduler using `node-cron` for background tasks.
## Overview
The scheduler runs periodic background jobs. In blue-green deployments, **multiple backend containers may run simultaneously**, so all jobs MUST use distributed locking to prevent duplicate execution.
## Registered Jobs
| Job | Schedule | Description |
|-----|----------|-------------|
| Notification processing | 8 AM daily | Process scheduled notifications |
| Account purge | 2 AM daily | GDPR compliance - purge deleted accounts |
| Backup check | Every minute | Check for due scheduled backups |
| Retention cleanup | 4 AM daily | Clean up old backups (also runs after each backup) |
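The schedules above map to standard five-field cron expressions. A minimal sketch follows; the job identifiers and the exact expressions are assumptions inferred from the schedule column, not taken from the codebase:

```typescript
// Hypothetical mapping of the jobs above to five-field cron expressions.
// Names and expressions are assumptions inferred from the schedule column.
const jobSchedules: Record<string, string> = {
  'notification-processor': '0 8 * * *', // 8 AM daily
  'account-purge': '0 2 * * *',          // 2 AM daily
  'backup-check': '* * * * *',           // every minute
  'retention-cleanup': '0 4 * * *',      // 4 AM daily
};

// Sanity check: every expression has the five cron fields
// (minute, hour, day-of-month, month, day-of-week).
function isFiveFieldCron(expr: string): boolean {
  return expr.trim().split(/\s+/).length === 5;
}

for (const [name, expr] of Object.entries(jobSchedules)) {
  if (!isFiveFieldCron(expr)) {
    throw new Error(`Invalid cron expression for ${name}: ${expr}`);
  }
}
```

`node-cron` accepts these five-field expressions directly when registering each job.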
## Distributed Locking Requirement
**All scheduled jobs MUST use the `lockService`** from `core/config/redis.ts` to prevent duplicate execution when multiple containers are running.
### Pattern for New Jobs
```typescript
import { v4 as uuidv4 } from 'uuid';
import { lockService } from '../../core/config/redis';
import { logger } from '../../core/logging/logger';

export async function processMyJob(): Promise<void> {
  const lockKey = 'job:my-job-name';
  const lockValue = uuidv4();
  const lockTtlSeconds = 300; // 5 minutes - adjust based on expected job duration

  // Try to acquire lock
  const acquired = await lockService.acquireLock(lockKey, lockTtlSeconds, lockValue);
  if (!acquired) {
    logger.debug('Job already running in another container, skipping');
    return;
  }

  try {
    logger.info('Starting my job');
    // Do work...
    logger.info('My job completed');
  } catch (error) {
    logger.error('My job failed', { error });
    throw error;
  } finally {
    // Always release the lock
    await lockService.releaseLock(lockKey, lockValue);
  }
}
```
### Lock Key Conventions
Use descriptive, namespaced lock keys:
| Pattern | Example | Use Case |
|---------|---------|----------|
| `job:{name}` | `job:notification-processor` | Global jobs (run once) |
| `{namespace}:{entity}:{id}` | `backup:schedule:uuid-here` | Per-entity jobs |
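A small helper can enforce these conventions so keys stay consistent across jobs. `buildLockKey` below is a hypothetical name, not an existing export:

```typescript
// Hypothetical helper for building lock keys per the conventions above.
// Global jobs:     buildLockKey('job', 'notification-processor')
// Per-entity jobs: buildLockKey('backup', 'schedule', someId)
function buildLockKey(namespace: string, name: string, id?: string): string {
  const parts = [namespace, name];
  if (id !== undefined) parts.push(id);
  // Reject empty segments and stray colons so keys stay unambiguous.
  if (parts.some((p) => p.length === 0 || p.includes(':'))) {
    throw new Error(`Invalid lock key segment in: ${parts.join(',')}`);
  }
  return parts.join(':');
}
```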
### Lock TTL Guidelines
Set TTL longer than the expected job duration, but short enough to recover from crashes:
| Job Duration | Recommended TTL |
|--------------|-----------------|
| < 10 seconds | 60 seconds |
| < 1 minute | 5 minutes |
| < 5 minutes | 15 minutes |
| Long-running | 30 minutes + heartbeat |
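The table can be encoded as a lookup so individual jobs don't hand-pick TTLs. A sketch, with thresholds mirroring the table (the function name is made up):

```typescript
// Hypothetical helper: pick a lock TTL (in seconds) from the expected
// job duration (in seconds), mirroring the guideline table above.
function recommendedTtlSeconds(expectedDurationSeconds: number): number {
  if (expectedDurationSeconds < 10) return 60;   // < 10 s   -> 60 seconds
  if (expectedDurationSeconds < 60) return 300;  // < 1 min  -> 5 minutes
  if (expectedDurationSeconds < 300) return 900; // < 5 min  -> 15 minutes
  return 1800; // long-running -> 30 minutes, paired with a lock-renewing heartbeat
}
```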
## Adding New Jobs
1. Create job file in the feature's `jobs/` directory
2. Implement distributed locking (see pattern above)
3. Register in `core/scheduler/index.ts`
4. Update this README with the new job
## Blue-Green Deployment Behavior
When both blue and green containers are running:
1. Both schedulers trigger at the same time
2. Both attempt to acquire the lock
3. Only one succeeds (atomic Redis operation)
4. The other skips the job execution
5. Lock is released when job completes
This ensures each job executes in at most one container per scheduled tick, regardless of how many containers are running. Note that if a job outlives its lock TTL, the lock can expire and admit a duplicate, which is why the TTL guidelines above matter.
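The acquire/release semantics behind steps 2-5 can be illustrated with an in-memory analogue of the Redis operations (`SET key value NX EX ttl` for acquire, compare-and-delete for release). This is a sketch of the semantics only, not the real `lockService`:

```typescript
// In-memory analogue of the Redis lock semantics (illustration only).
type Entry = { value: string; expiresAt: number };

class InMemoryLockService {
  private store = new Map<string, Entry>();

  // Mirrors SET key value NX EX ttl: succeeds only if no live lock exists.
  acquireLock(key: string, ttlSeconds: number, value: string): boolean {
    const existing = this.store.get(key);
    if (existing && existing.expiresAt > Date.now()) return false;
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
    return true;
  }

  // Mirrors a compare-and-delete release: only the holder (matching value)
  // may release, so a slow job cannot delete a lock that has expired and
  // been re-acquired by another container.
  releaseLock(key: string, value: string): boolean {
    const existing = this.store.get(key);
    if (!existing || existing.value !== value) return false;
    this.store.delete(key);
    return true;
  }
}

// Two "containers" race for the same job lock: only one wins.
const locks = new InMemoryLockService();
const blue = 'uuid-blue';
const green = 'uuid-green';
const blueWins = locks.acquireLock('job:my-job-name', 300, blue);
const greenWins = locks.acquireLock('job:my-job-name', 300, green);
// blueWins is true, greenWins is false; the green container skips the run.
```

In real Redis, both the `NX`/`EX` set and the compare-and-delete (typically a Lua script) are atomic on the server, which is what makes step 3 safe across processes; the in-memory version is only safe because Node's event loop serializes the calls.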