Nirmalya Ghosh | Applied AI Technologist

Schema Pruning for Text-to-SQL: 93% Less Context, Zero LLM Calls


In Part 1, the naïve Text-to-SQL approach sent 8,414 tokens of schema context to generate 16 tokens of SQL - a 526:1 input-to-output ratio. This post engineers the fix: a deterministic schema pruner - context engineering at the schema layer - that selects only the tables relevant to each query, with no LLM dependency.
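As a rough illustration of the idea (not the post's actual implementation), a deterministic pruner can be as simple as matching the question's terms against table and column names and keeping only the tables that overlap; the table names, columns, and matching logic below are all hypothetical:

```python
# Hypothetical sketch of a deterministic schema pruner: keep only tables
# whose name or column words overlap with the question's terms, so the
# schema context sent to the LLM shrinks without any LLM calls.
import re

SCHEMA = {  # table -> column names (toy example)
    "orders": ["order_id", "customer_id", "order_date", "total"],
    "customers": ["customer_id", "name", "email"],
    "products": ["product_id", "name", "price"],
}

def prune_schema(question: str, schema: dict[str, list[str]]) -> list[str]:
    """Return tables whose name or column words appear in the question."""
    terms = set(re.findall(r"[a-z]+", question.lower()))
    kept = []
    for table, cols in schema.items():
        # Singularised table name plus every underscore-separated column word.
        names = {table.rstrip("s")} | {w for c in cols for w in c.split("_")}
        if terms & names:
            kept.append(table)
    return kept

print(prune_schema("total spent by each customer", SCHEMA))
# → ['orders', 'customers']
```

A production pruner would be more careful (synonyms, foreign-key closure, ranking), but the core point stands: relevance filtering here is cheap string matching, not another model call.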

Continue reading ...

Text-to-SQL the Naïve Way: Why Most Demos Fail in Production


The promise of Text-to-SQL is compelling: let anyone query a database using plain English. The reality is that most implementations silently return wrong data, expose sensitive information, and cost more than they should.

Continue reading ...

TTFT Optimisation: Practical Patterns

How to reduce TTFT in production: practical patterns, implementation strategies, and edge cases to watch for.

Continue reading ...

How Prompt Size Directly Impacts LLM Response Latency

Understanding the mechanics of Time to First Token (TTFT) and why those extra tokens may lead to poor user experience (UX).

Continue reading ...

A Newsletter Decluttering AI Agent Using ReAct Pattern

Our inboxes contain dozens (if not hundreds) of newsletters we subscribed to during moments of curiosity, but we seldom read most of them. Manually unsubscribing is tedious: open each email, scroll to the bottom, click unsubscribe, confirm … repeat 50+ times.

This post covers a personal project: an AI agent built with the ReAct pattern that analyses the newsletters I subscribe to and recommends which ones to unsubscribe from, based on my reading behaviour.

Continue reading ...