How we build the Singapore life insurance policy library
Every fact rendered on lifeinsurance.com.sg traces back to a public Product Summary published by a MAS-licensed insurer on compareFIRST.sg. This page documents the end-to-end pipeline so anyone — buyers, advisers, journalists, regulators — can audit how a given fact got onto the site.
Current snapshot
- 15 MAS-licensed Singapore life insurers covered
- 50 active products ingested into the policy library
- 20 cross-insurer comparison topics
- 2,450 wording clause chunks embedded for semantic search
- Snapshot generated: 2026-06-05
1. Source authority
Our authoritative source is compareFIRST.sg, the comparison portal sanctioned by the Life Insurance Association (LIA) and the Monetary Authority of Singapore (MAS). Every MAS-licensed Singapore life insurer is required to publish a standardised Product Summary on the portal for every active retail product. These PDFs are our reference for product features, premium illustrations, exclusions and the full fee schedule.
Where compareFIRST.sg doesn't carry a specific product (typically adviser-only or institutional-only structures), we cite the insurer's own published Product Summary or Product Highlights Sheet, with the same source-link and ingestion-date discipline.
2. Ingestion pipeline
For each Product Summary PDF, the automated pipeline:
- Fetches the PDF from the source URL, validates the content-type, and computes a sha256 hash of the bytes. The hash is the dedupe key — re-ingesting an unchanged PDF is a no-op.
- Extracts markdown text from the PDF via Anthropic Claude Haiku 4.5 vision (selected for accurate table extraction from MAS-format Product Summaries).
- Extracts structured facts from the markdown via Anthropic Claude Sonnet 4.6, against a Zod-validated schema covering 20+ canonical topics (exclusions, riders, free-look period, multipliers, premium term, surrender value, CPF eligibility, etc.).
- Stores the result in Supabase tables: insurance.sg_life_wordings for the markdown + metadata, insurance.wording_facts for the structured extraction, with a confidence-tier classifier (verified / inferred).
- Embeds clause chunks via OpenAI text-embedding-3-small (1536d). Each ingested wording is sectioned by heading, then chunked to ~500 tokens, then embedded for pgvector semantic search at /clause-search/.
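The sectioning and chunking step can be illustrated with a minimal sketch. It is not the production code: it assumes the extracted markdown uses "#"-style headings, and it approximates tokens by whitespace-separated words instead of a real tokenizer. The names ClauseChunk, splitByHeading and chunkWording are hypothetical.

```typescript
// Hypothetical chunk shape: the heading a chunk came from, plus its text.
interface ClauseChunk {
  heading: string;
  text: string;
}

// Split markdown into sections at "#"-style headings (assumed heading format).
function splitByHeading(markdown: string): ClauseChunk[] {
  const sections: ClauseChunk[] = [];
  let heading = "";
  let body: string[] = [];
  const flush = (): void => {
    const text = body.join("\n").trim();
    if (text) sections.push({ heading, text });
    body = [];
  };
  for (const line of markdown.split("\n")) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush(); // close the previous section before starting a new one
      heading = match[1].trim();
    } else {
      body.push(line);
    }
  }
  flush();
  return sections;
}

// Cut each section into pieces of at most maxTokens words.
// Assumption: one whitespace-separated word stands in for one token.
function chunkWording(markdown: string, maxTokens = 500): ClauseChunk[] {
  const chunks: ClauseChunk[] = [];
  for (const { heading, text } of splitByHeading(markdown)) {
    const words = text.split(/\s+/);
    for (let i = 0; i < words.length; i += maxTokens) {
      chunks.push({ heading, text: words.slice(i, i + maxTokens).join(" ") });
    }
  }
  return chunks;
}
```

Keeping the heading on every chunk means a search hit can be shown with the section it belongs to, even when a long section is split into several pieces.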
The entire pipeline is idempotent — re-running against the same source PDFs is a no-op (pdf_hash dedupe). When a new Product Summary is published (e.g. a December revision), the new PDF gets a new hash and is ingested fresh; the prior version is retained with its superseded_at timestamp set.
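The dedupe-and-supersede behaviour can be sketched as follows. This is an illustration under stated assumptions, not the pipeline itself: an in-memory array stands in for the wordings table, product_id is a hypothetical key used to find the prior version, and the function names sha256Hex and ingest are invented for the example. Only pdf_hash, source_url, ingested_at and superseded_at come from this page.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for a wordings row. product_id is an assumed key.
interface WordingRow {
  product_id: string;
  source_url: string;
  pdf_hash: string;
  ingested_at: string;
  superseded_at: string | null;
}

// sha256 of the raw PDF bytes, hex-encoded.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Returns "skipped" when this exact PDF was already ingested for the product.
// Otherwise marks the current version as superseded and appends the new one.
function ingest(
  rows: WordingRow[],
  productId: string,
  sourceUrl: string,
  pdfBytes: Uint8Array,
  now: string,
): "skipped" | "ingested" {
  const hash = sha256Hex(pdfBytes);
  const seen = rows.some((r) => r.product_id === productId && r.pdf_hash === hash);
  if (seen) return "skipped"; // unchanged PDF: no-op

  const current = rows.find(
    (r) => r.product_id === productId && r.superseded_at === null,
  );
  if (current) current.superseded_at = now; // keep the prior version, mark it superseded

  rows.push({
    product_id: productId,
    source_url: sourceUrl,
    pdf_hash: hash,
    ingested_at: now,
    superseded_at: null,
  });
  return "ingested";
}
```

Calling ingest twice with the same bytes leaves the array unchanged on the second call, which is the idempotency property described above. A revised PDF produces a second row and sets superseded_at on the first.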
3. Verification
Every wording row on the site carries:
- An effective_from date: when the policy version becomes the current wording.
- A source_url: the public PDF on compareFIRST.sg.
- A pdf_hash: sha256 of the source PDF bytes.
- An ingested_at timestamp: when our pipeline last processed the source.
- A confidence tier on each fact: verified (extracted from the policy wording with high cross-validation) or inferred (extracted with lower confidence; surfaced only when other evidence supports it).
A pre-commit audit script (audit-wording-integrity.cjs) hard-fails any deploy if a product in the policy library is missing source_url, pdf_hash or effective_from. The site never ships an unsourced wording.
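The kind of check such an audit performs might look like the sketch below. It is not the contents of audit-wording-integrity.cjs: the ProductRow shape, the product_id field and the findViolations function are assumptions made for illustration, and only the three audited field names come from this page.

```typescript
// Hypothetical row shape. Only the three audited fields are taken from the page.
interface ProductRow {
  product_id: string;
  source_url?: string | null;
  pdf_hash?: string | null;
  effective_from?: string | null;
}

// Fields that must be present and non-empty on every product.
const REQUIRED_FIELDS = ["source_url", "pdf_hash", "effective_from"] as const;

// Returns one message per missing or blank required field.
function findViolations(rows: ProductRow[]): string[] {
  const violations: string[] = [];
  for (const row of rows) {
    for (const field of REQUIRED_FIELDS) {
      const value = row[field];
      if (typeof value !== "string" || value.trim() === "") {
        violations.push(`${row.product_id}: missing ${field}`);
      }
    }
  }
  return violations;
}

// A hard-failing script would exit non-zero when anything is reported, e.g.:
//   if (findViolations(rows).length > 0) process.exit(1);
```

Treating blank strings as missing is a design choice in this sketch: an empty source_url is no more auditable than an absent one.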
4. Refresh cadence
compareFIRST.sg publishes Product Summary revisions on an irregular cadence — typically once per year per product, sometimes more often when an insurer updates terms. Our refresh schedule:
- Continuous — when the user-facing site notices a stale source URL, that product is requeued for re-ingestion.
- Quarterly — full policy-library refresh, every product re-fetched and re-extracted.
- Ad-hoc — when a buyer or adviser flags a discrepancy via hello@lifeinsurance.com.sg, the specific product is prioritised.
5. What we don't publish
We do not publish:
- Fabricated star ratings or fake review counts. We do not publish AggregateRating JSON-LD without a real underlying survey + disclosed methodology.
- Indicative premium quotes presented as binding. Premiums are highly underwriting-dependent — quote illustrations on Product Summaries are non-binding scenarios.
- Claims that lifeinsurance.com.sg is a MAS-licensed financial adviser. We are not.
- Claims that lifeinsurance.com.sg is an insurer. We are not.
- Subjective rankings ("best", "leading", "premier", "#1") without disclosed methodology. This is a hard rule under the Consumer Protection (Fair Trading) Act (CPFTA) and ASAS advertising-standards guidance.
6. Corrections
If you spot an extraction error or a stale source URL, please email hello@lifeinsurance.com.sg. We respond to factual corrections within 1 business day and re-ingest the affected product immediately.
Insurers wishing to flag inaccuracies in their product representation may also contact us at the same address. We hold no exclusive distribution agreements and have no commercial bias toward or against any insurer — accuracy is the only criterion.
7. Singapore legal context
Lifeinsurance.com.sg operates under Singapore law, including the Financial Advisers Act 2001 (FAA), the Insurance Act 1966, the Personal Data Protection Act 2012 (PDPA), the Consumer Protection (Fair Trading) Act 2003 (CPFTA), and ASAS advertising-standards guidance. We are not a MAS-licensed financial adviser. We are not an insurer. We are not a licensed insurance broker. We are an independent comparison surface that publishes structured-data summaries of public insurer disclosures.
For binding interpretation of any policy term, always read the full Policy Contract in addition to the Product Summary, and consult the insurer or a MAS-licensed financial adviser.