How do you verify the output of something more intelligent than you?
That's Dixit's dilemma of Generative AI. It's not famous yet 🙂.
But as we employ Generative AI more, this dilemma will become famous. Take the case of code generated by GenAI: the developer using it needs to be more skilled than the complexity of the generated code in order to judge whether the output is correct.
A corollary of Dixit's dilemma is that our qualitative gains from GenAI are capped by the quality of the verifier!
Did someone talk of GenAI making your job redundant? It can in the quantitative domain, but not in the qualitative domain (the mundane can be replaced, but the classy can't be).
EDIT: Some more clarifications.
The holy grail of code generation.
The holy grail we are all chasing with tools like Copilot is that they can generate the whole code from requirement documents. This is not a stated need, but that is what people are hoping for. Read on.
If you look at the popular demos of code generation by GenAI tools, a few scenarios occur often:
1. Given a table or JSON, generate a complete web service
2. Find performance or security bugs in code
3. Find structural issues with code as per the language specification
4. Flag issues with dependencies from deep within the codebase
5. Document my code
6. Write test cases, elementary ones
Now if you look at the capabilities of IDEs, code generators, profilers, and linters, such things have existed for decades for most programming languages. But they were siloed in nature. So we can give LLM tools credit for bringing it all together. This can benefit existing codebases a lot.
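To make the point concrete, here is a minimal sketch of the kind of siloed structural check (scenario 3 above) that static tools have performed for decades, long before LLMs. The `SOURCE` snippet and the rule it checks (flagging bare `except:` clauses) are illustrative choices, not from the original post:

```python
import ast

# Illustrative source snippet containing a classic structural issue:
# a bare `except:` that silently swallows every error.
SOURCE = """
def read_config(path):
    try:
        return open(path).read()
    except:
        return None
"""

def find_bare_excepts(source: str) -> list[int]:
    """Return line numbers of bare `except:` handlers in the source."""
    tree = ast.parse(source)
    return [node.lineno
            for node in ast.walk(tree)
            if isinstance(node, ast.ExceptHandler) and node.type is None]

print(find_bare_excepts(SOURCE))  # -> [5]
```

A twenty-line script like this is exactly the sort of capability that used to live in a separate linter; LLM tools fold many such checks into one conversational interface.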
The situation becomes interesting for a new codebase. The demos where one makes the tool write a script for managing a server, or a sorting algorithm, are in essence demonstrations of smart lookup (of a relevant code fragment). This can benefit seasoned developers by saving a few keystrokes. What next?
What we all expect next is: can the tool take long-format requirements and generate functional code? That is to say, can it read my Jira requirements and understand my user personas, interfaces, domain jargon, different flows, and so on? It is certainly possible to give lots of prompt context to AI tools and create such a demo.
Can we do it at scale for an organization?
Can we do it in a manner where prompting the requirement context doesn't become as complex a job as that of the programmer? AND
How can we validate that whatever is generated is functionally correct without needing a new battalion of validators?
These are the practical questions for which we don't have answers.
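One partial answer to the validation question is to make the requirement itself executable, so the verifier is a test suite rather than a battalion of humans reading generated code. A minimal, hypothetical sketch: `apply_discount` stands in for a GenAI-generated function, and the acceptance suite encodes an assumed requirement ("10% off orders above 100") as checks. Both the function and the requirement are invented here for illustration:

```python
def apply_discount(amount: float) -> float:
    """Stand-in for a GenAI-generated function (hypothetical)."""
    if amount > 100:
        return round(amount * 0.9, 2)
    return amount

def acceptance_suite(fn) -> list[str]:
    """Run requirement-derived checks against fn; return failures."""
    failures = []
    cases = [
        (50.0, 50.0),    # at or below 100: no discount
        (100.0, 100.0),  # boundary value: still no discount
        (200.0, 180.0),  # above 100: 10% off
    ]
    for amount, expected in cases:
        got = fn(amount)
        if got != expected:
            failures.append(f"fn({amount}) = {got}, expected {expected}")
    return failures

print(acceptance_suite(apply_discount))  # -> [] (all checks pass)
```

Of course, this only moves Dixit's dilemma one level up: someone still has to verify that the test cases faithfully capture the requirement.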
This is not to say it's not possible; domain-specific languages for automatic code generation have been in place for a long time. Coupled with GenAI, they can do wonders. It's just that the current breed of GenAI is not made for this sole purpose. The day that happens, all the GenAI demos will happen for CTOs instead of CEOs 🙂. Read that again 🙂
One reply on “Dixit’s dilemma of Generative AI”
How many genius people are required per hundred developers to do good-quality prompt engineering and verify the output?
Maybe the answer is fewer than 10.