Title: Feature: ETag Strategy Alignment with Web Best Practices · Issue #101 · JavaScriptSolidServer/JavaScriptSolidServer · GitHub
Description: Summary This issue proposes a comprehensive ETag strategy for JavaScriptSolidServer aligned with HTTP caching best practices. ETags are critical for efficient caching, bandwidth reduction, and preventing mid-air collisions during concurr...
Opengraph URL: https://github.com/JavaScriptSolidServer/JavaScriptSolidServer/issues/101
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Feature: ETag Strategy Alignment with Web Best Practices","articleBody":"## Summary\n\nThis issue proposes a comprehensive ETag strategy for JavaScriptSolidServer aligned with HTTP caching best practices. ETags are critical for efficient caching, bandwidth reduction, and preventing mid-air collisions during concurrent edits.\n\n**Difficulty**: 35/100 \n**Estimated Effort**: 2-3 days \n**Dependencies**: None\n\n---\n\n## Current State Analysis\n\n### ETag Generation\n**File**: `src/storage/filesystem.js` line 32\n\n```javascript\netag: `\"${crypto.createHash('md5').update(stats.mtime.toISOString() + stats.size).digest('hex')}\"`\n```\n\n**Current approach**: MD5 hash of `mtime + size` (metadata-based)\n\n| Aspect | Current | Issue |\n|--------|---------|-------|\n| Algorithm | MD5 | Cryptographically weak (acceptable for ETags, but not ideal) |\n| Input | mtime + size | Not content-based, can miss changes if mtime preserved |\n| Type | Strong (no W/ prefix) | Claims byte-identical but isn't truly content-based |\n| Caching | Synchronous crypto | Blocks event loop on every stat() call |\n\n### Conditional Request Handling\n**File**: `src/utils/conditional.js` (154 lines)\n\n✅ **Well implemented**:\n- `If-Match` header for safe updates (412 on mismatch)\n- `If-None-Match` for GET/HEAD (304 Not Modified)\n- `If-None-Match` for PUT/POST (create-only with `*`)\n- Wildcard (`*`) support\n- Proper normalization (strips W/ prefix, quotes)\n\n### Cache-Control Headers\n**Current usage**:\n\n| Location | Value | Purpose |\n|----------|-------|---------|\n| `resource.js:225` | `no-store` | Mashlib HTML responses |\n| `resource.js:303` | `no-store` | Mashlib HTML responses |\n| `idp/index.js:202` | `public, max-age=3600` | JWKS endpoint |\n| `idp/index.js:209` | `public, max-age=3600` | OpenID configuration |\n| `idp/credentials.js:147` | `no-store` | Credentials endpoint |\n\n**Missing**: No 
`Cache-Control` on regular resource responses.\n\n### Last-Modified Header\n**Status**: ❌ Not implemented\n\n`mtime` is available from `stat()` but not exposed as a `Last-Modified` header.\n\n---\n\n## Issues Identified\n\n### 1. Strong ETag Mismatch\n**Severity**: MEDIUM\n\nCurrent ETags are formatted as strong (`\"abc123\"`) but are generated from metadata, not content. Per RFC 7232:\n\n\u003e A strong validator is representation metadata that changes value whenever a change occurs to the representation data that would be observable in the payload body of a 200 (OK) response to GET.\n\n**Problem**: If a file's content changes but its size and `mtime` stay the same (e.g., the timestamp is restored afterwards with `touch -r`), the ETag won't change even though content changed.\n\n### 2. Missing Content-Based ETags for Dynamic Content\n**Severity**: MEDIUM\n\nContent negotiation transforms stored content:\n- Turtle → JSON-LD conversion\n- JSON-LD → Turtle conversion\n- HTML data island extraction\n\nThese transformations produce different byte streams, but may use the same source file ETag.\n\n### 3. No Last-Modified Header\n**Severity**: LOW\n\nSome clients prefer `Last-Modified` over ETags. Both should be provided per best practices.\n\n### 4. Container Listing ETags\n**Severity**: MEDIUM\n\nContainer listings are dynamically generated. Current implementation may use directory `mtime`, but this doesn't reflect:\n- Content changes to existing child resources\n- Nested container changes\n- ACL changes affecting visibility\n\n### 5. Synchronous Hash Calculation\n**Severity**: LOW (Performance)\n\nThe hash is computed synchronously on every `stat()` call. For high-traffic servers, this could become a bottleneck.\n\n### 6. Cache-Control Strategy Missing\n**Severity**: MEDIUM\n\nNo systematic `Cache-Control` headers on resource responses. 
This means:\n- Browsers may apply heuristic caching and serve stale responses for an unpredictable period\n- Or revalidate on every request (no caching benefit)\n- CDNs can't optimize caching\n\n---\n\n## Web Best Practices\n\n### RFC 7232 - Conditional Requests\n- Strong ETags: Byte-for-byte identical representations\n- Weak ETags: Semantically equivalent (use `W/` prefix)\n- `If-Match`: For safe mutations (optimistic concurrency)\n- `If-None-Match`: For caching (GET) or create-only (PUT)\n\n### RFC 7234 - HTTP Caching\n- `Cache-Control`: Primary caching directive\n- `ETag` + `Cache-Control`: Work together for efficient revalidation\n- `Last-Modified`: Fallback for clients not supporting ETags\n\n### Industry Recommendations\n\n| Source | Recommendation |\n|--------|----------------|\n| **MDN** | Use both ETag and Last-Modified; combine with Cache-Control |\n| **Cloudflare** | Strong ETags for byte-identical; weak for semantic equivalence |\n| **Fastly** | Content hash for strong ETags; metadata for weak |\n| **Google** | Set explicit Cache-Control; don't rely on heuristics |\n\n---\n\n## Proposed Strategy\n\n### 1. ETag Generation Tiers\n\n**Tier 1: Strong ETag (content-based)**\n- Use for: Static files where content hash is feasible\n- Algorithm: SHA-256, base64url encoded, truncated to 27 chars (162 bits)\n\n**Tier 2: Weak ETag (metadata-based)**\n- Use for: Large files, dynamic content, containers\n- Format: `W/\"hash\"` with mtime + size + extras\n\n**Tier 3: Version ETag (for transformed content)**\n- Use for: Content negotiation results\n- Format: `W/\"hash\"` derived from source ETag + transformation type\n\n### 2. 
ETag Strategy by Resource Type\n\n| Resource Type | ETag Strategy | Rationale |\n|---------------|---------------|-----------|\n| Small files (\u003c1MB) | Strong (content hash) | Accurate, worth the compute |\n| Large files (\u003e1MB) | Weak (metadata) | Too expensive to hash |\n| Containers | Weak (mtime + child count) | Dynamic, changes frequently |\n| Conneg results | Weak (source + transform) | Derived content |\n| Mashlib/UI | Weak (version) | Static but frequently updated |\n\n### 3. Cache-Control Strategy\n\n| Profile | Cache-Control Value | Use Case |\n|---------|---------------------|----------|\n| `resource` | `private, no-cache, must-revalidate` | User-generated content |\n| `container` | `private, no-cache, must-revalidate` | Container listings |\n| `static` | `public, max-age=3600, stale-while-revalidate=86400` | Mashlib, schemas |\n| `immutable` | `public, max-age=31536000, immutable` | Versioned assets |\n| `sensitive` | `private, no-store` | Credentials, tokens |\n| `discovery` | `public, max-age=3600` | Well-known endpoints |\n\n### 4. Last-Modified Header\n\nAdd `Last-Modified` to all resource responses using `stats.mtime.toUTCString()`.\n\n### 5. 
Vary Header for Content Negotiation\n\nWhen content negotiation is enabled, add:\n```\nVary: Accept, Accept-Language\n```\n\nThis tells caches that different `Accept` headers produce different responses.\n\n---\n\n## Implementation Plan\n\n### Phase 1: Foundation\n- [ ] Create `src/utils/etag.js` with tiered generation functions\n- [ ] Create `src/utils/caching.js` with cache profiles\n- [ ] Replace MD5 with SHA-256 in etag generation\n- [ ] Switch large files to weak ETags\n\n### Phase 2: Headers\n- [ ] Add `Last-Modified` header to all resource responses\n- [ ] Implement `Cache-Control` profiles by resource type\n- [ ] Add `Vary` header for conneg responses\n\n### Phase 3: Content-Based ETags\n- [ ] Implement content hashing for small files (\u003c1MB threshold)\n- [ ] Cache computed ETags to avoid repeated hashing\n- [ ] Add async ETag computation option\n\n### Phase 4: Container ETags\n- [ ] Improve container ETag calculation (include child count, newest mtime)\n- [ ] Consider membership hash for accurate container ETags\n\n### Phase 5: Conneg ETags\n- [ ] Generate distinct ETags for transformed content\n- [ ] Include transformation type in ETag calculation\n\n---\n\n## Configuration Options\n\n```json\n{\n \"etag\": {\n \"algorithm\": \"sha256\",\n \"strongThreshold\": 1048576,\n \"cacheEtags\": true,\n \"cacheMaxSize\": 10000\n },\n \"caching\": {\n \"defaultProfile\": \"resource\",\n \"staticMaxAge\": 3600,\n \"immutableAssets\": false\n }\n}\n```\n\n---\n\n## Comparison Matrix\n\n### Current vs Proposed\n\n| Aspect | Current | Proposed |\n|--------|---------|----------|\n| ETag algorithm | MD5 | SHA-256 |\n| ETag basis | Metadata only | Content (small) / Metadata (large) |\n| ETag type | Always strong | Strong or weak based on accuracy |\n| Last-Modified | ❌ Missing | ✅ Always included |\n| Cache-Control | ❌ Inconsistent | ✅ Profile-based |\n| Vary header | ❌ Missing | ✅ For conneg |\n| Container ETags | Basic mtime | Enhanced (children, membership) |\n| 
Conneg ETags | Source ETag | Distinct per transformation |\n\n### Solid Ecosystem Comparison\n\n| Server | ETag Strategy |\n|--------|---------------|\n| **Node Solid Server** | Content hash (MD5) |\n| **Community Solid Server** | Content hash + representation metadata |\n| **ESS (Inrupt)** | Proprietary, content-based |\n| **JSS (current)** | Metadata-based MD5 |\n| **JSS (proposed)** | Tiered: content/metadata with proper typing |\n\n---\n\n## Testing Plan\n\n### Unit Tests\n- Strong ETag format validation\n- Weak ETag format validation\n- Different ETags for conneg transforms\n- 304 responses for matching ETags\n- 412 responses for If-Match mismatch\n\n### Integration Tests\n- [ ] CDN compatibility (Cloudflare, Fastly)\n- [ ] Browser caching behavior\n- [ ] Concurrent edit scenarios (If-Match)\n- [ ] Solid app compatibility (SolidOS, Penny, etc.)\n\n---\n\n## Security Considerations\n\n1. **ETag as fingerprint**: ETags can be used to track users across requests. Mitigated by using `private` in Cache-Control.\n\n2. **Timing attacks**: Content-based ETags reveal if content changed. This is inherent to caching and generally acceptable.\n\n3. **ETag collision**: SHA-256 truncated to 162 bits (27 base64url chars) is still collision-resistant. 
MD5 collisions are feasible but unlikely to be exploited via ETags.\n\n---\n\n## References\n\n- [RFC 7232 - Conditional Requests](https://datatracker.ietf.org/doc/html/rfc7232)\n- [RFC 7234 - HTTP Caching](https://datatracker.ietf.org/doc/html/rfc7234)\n- [MDN - ETag](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag)\n- [MDN - HTTP Conditional Requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Conditional_requests)\n- [Cloudflare - ETag Headers](https://developers.cloudflare.com/cache/reference/etag-headers/)\n- [Fastly - ETags: What they are](https://www.fastly.com/blog/etags-what-they-are-and-how-to-use-them)\n- [Google Web Fundamentals - HTTP Caching](https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching)\n\n---\n\n## Related Issues\n\n- #100 - Production Readiness (ETag caching mentioned in performance section)\n- #99 - Docker (caching affects container behavior)\n","author":{"url":"https://github.com/melvincarvalho","@type":"Person","name":"melvincarvalho"},"datePublished":"2026-01-19T13:41:35.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":0},"url":"https://github.com/JavaScriptSolidServer/JavaScriptSolidServer/issues/101"}
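The three ETag tiers proposed in the issue body could be sketched roughly as below. This is only an illustration under stated assumptions, not the project's implementation: the function names (`shortHash`, `strongEtag`, `weakEtag`, `variantEtag`, `etagFor`) are hypothetical, the threshold simply mirrors the proposed `strongThreshold` value, and the weak tier hashes `mtimeMs` and `size` as one possible reading of "mtime + size + extras".

```javascript
const crypto = require('node:crypto');

// Assumption: mirrors the proposed "strongThreshold" config value (1 MiB).
const STRONG_THRESHOLD = 1048576;

// SHA-256 digest, base64url encoded and truncated to 27 characters (162 bits).
function shortHash(input) {
  return crypto.createHash('sha256').update(input).digest('base64url').slice(0, 27);
}

// Tier 1: strong ETag computed from the content bytes.
function strongEtag(content) {
  return `"${shortHash(content)}"`;
}

// Tier 2: weak ETag computed from metadata (mtime, size, optional extras).
function weakEtag(stats, extras = '') {
  return `W/"${shortHash(`${stats.mtimeMs}:${stats.size}:${extras}`)}"`;
}

// Tier 3: weak ETag for a content-negotiated variant, derived from the
// source ETag plus a label naming the transformation.
function variantEtag(sourceEtag, transform) {
  return `W/"${shortHash(`${sourceEtag}|${transform}`)}"`;
}

// Hypothetical selector: hash the content when it is available and small,
// otherwise fall back to the metadata-based weak ETag.
function etagFor(stats, content) {
  if (content !== undefined && stats.size < STRONG_THRESHOLD) {
    return strongEtag(content);
  }
  return weakEtag(stats);
}
```

A caller would pass the result of `fs.stat()` and, for small files, the file contents; Turtle and JSON-LD renderings of the same file would then each get their own `variantEtag` value.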